mirror of
https://github.com/hwchase17/langchain
synced 2024-10-29 17:07:25 +00:00
fd9975dad7
I've added a simple [CoNLL-U](https://universaldependencies.org/format.html) document loader. CoNLL-U is a common format for NLP tasks and is used, for example, in the Universal Dependencies treebank corpora. The loader reads a single file in standard CoNLL-U format and returns a document.
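A minimal sketch of what such a loader might do is shown below. It is not the code from this commit: the `Document` dataclass and the `load_conllu` function are hypothetical names used for illustration, and the sketch assumes tab-separated token lines with the ten standard CoNLL-U columns. It joins the FORM column of each word line and omits the following space when the MISC column contains `SpaceAfter=No`.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical stand-in for a loader's document type."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def load_conllu(text: str, source: str = "<string>") -> Document:
    """Join the FORM column of CoNLL-U word lines into one text string."""
    pieces = []
    for line in text.splitlines():
        # Skip blank sentence separators and comments such as "# sent_id = 1".
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 10:
            continue  # not a well-formed token line
        token_id, form, misc = cols[0], cols[1], cols[9]
        # Keep only plain integer IDs; ranges ("1-2") and empty nodes ("1.1")
        # are ignored in this simplified sketch.
        if not token_id.isdigit():
            continue
        pieces.append(form)
        if "SpaceAfter=No" not in misc.split("|"):
            pieces.append(" ")
    return Document("".join(pieces).strip(), {"source": source})


# The sample sentence, written with spaces here and converted to tabs below.
rows = [
    "1 They they PRON PRP Case=Nom|Number=Plur 2 nsubj 2:nsubj|4:nsubj _",
    "2 buy buy VERB VBP Number=Plur|Person=3|Tense=Pres 0 root 0:root _",
    "3 and and CONJ CC _ 4 cc 4:cc _",
    "4 sell sell VERB VBP Number=Plur|Person=3|Tense=Pres 2 conj 0:root|2:conj _",
    "5 books book NOUN NNS Number=Plur 2 obj 2:obj|4:obj SpaceAfter=No",
    "6 . . PUNCT . _ 2 punct 2:punct _",
]
sample = "# sent_id = 1\n# text = They buy and sell books.\n"
sample += "\n".join("\t".join(row.split()) for row in rows) + "\n"

doc = load_conllu(sample)
print(doc.page_content)
```

Reading the text from the FORM column rather than from the `# text` comment keeps the sketch usable on files that omit that comment, at the cost of ignoring multiword-token surface forms.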
9 lines
400 B
Plaintext
# sent_id = 1
# text = They buy and sell books.
1 They they PRON PRP Case=Nom|Number=Plur 2 nsubj 2:nsubj|4:nsubj _
2 buy buy VERB VBP Number=Plur|Person=3|Tense=Pres 0 root 0:root _
3 and and CONJ CC _ 4 cc 4:cc _
4 sell sell VERB VBP Number=Plur|Person=3|Tense=Pres 2 conj 0:root|2:conj _
5 books book NOUN NNS Number=Plur 2 obj 2:obj|4:obj SpaceAfter=No
6 . . PUNCT . _ 2 punct 2:punct _