conll
Note
For loading new Conll objects from a file or string, prefer the load module, which provides the main entry points for parsing CoNLL.
This module represents a CoNLL file, i.e. a collection of CoNLL annotated sentences. Like other collections in Python, Conll objects can be indexed, sliced, iterated, etc. (specifically, Conll implements the MutableSequence contract). Conll objects are Conllable, so they can be converted into a CoNLL string or written directly to a file with the write method.
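To illustrate what the MutableSequence contract provides, here is a minimal sketch using a hypothetical stand-in class, SentenceList, that stores plain strings instead of Sentence objects. It is not pyconll's implementation; it only shows that defining the five required methods yields indexing, slicing, iteration, membership tests, and append.

```python
from collections.abc import MutableSequence


class SentenceList(MutableSequence):
    """Hypothetical stand-in for a Conll-like container (not pyconll code)."""

    def __init__(self, sentences=()):
        self._sentences = list(sentences)

    def __getitem__(self, key):
        if isinstance(key, slice):
            # A slice returns another container of the same type.
            return SentenceList(self._sentences[key])
        return self._sentences[key]

    def __setitem__(self, key, value):
        self._sentences[key] = value

    def __delitem__(self, key):
        del self._sentences[key]

    def __len__(self):
        return len(self._sentences)

    def insert(self, index, value):
        self._sentences.insert(index, value)


corpus = SentenceList(["s1", "s2", "s3"])
corpus.append("s4")        # provided by MutableSequence, built on insert
first_two = corpus[:2]     # slicing yields another SentenceList
del corpus[0]              # deletion by integer position
remaining = list(corpus)   # iteration is derived from __getitem__ and __len__
```

The mixin methods (append, extend, pop, __contains__, __iter__, and so on) come from the abstract base class, so a container only has to define its storage and the five core operations.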
API
Defines the Conll type and the associated parsing and output logic.
-
class pyconll.unit.conll.Conll(it: Iterable[str])
The abstraction for a CoNLL-U file. A CoNLL-U file is essentially an ordered collection of sentences. These sentences are accessed by numeric index. Note that sentences must be separated by whitespace. CoNLL-U also specifies that the file must end in a newline, but parsing here relaxes that requirement.
-
__contains__(other: object) → bool
Check if the Conll object has this sentence.
- Parameters
other – The sentence to check for.
- Returns
True if this exact Sentence is in the Conll object, False otherwise.
-
__delitem__(key: Union[int, slice]) → None
Delete the Sentence corresponding to the given key.
- Parameters
key – Identifies the Sentence or sentences to delete. Can be an integer position in the file or a slice.
-
__getitem__(key: int) → pyconll.unit.sentence.Sentence
__getitem__(key: slice) → Conll
Index a sentence by key value.
- Parameters
key – The key to index the sentence by. This key can either be a numeric key, or a slice.
- Returns
The corresponding Sentence if the key is an int, or another Conll object containing the selected sentences if the key is a slice.
- Raises
TypeError – If the key is not an integer or slice.
-
__init__(it: Iterable[str]) → None
Create a CoNLL-U file collection of sentences.
- Parameters
it – An iterable over the lines of the CoNLL-U file.
- Raises
ParseError – If there is an error constructing the sentences in the iterator.
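As a rough picture of what consuming an iterable of lines involves, the sketch below groups lines into per-sentence blocks separated by blank lines and tolerates a missing final newline. The helper split_sentence_blocks is hypothetical and is not pyconll's parser; the sample lines are abbreviated and do not carry the full ten CoNLL-U columns.

```python
from typing import Iterable, Iterator, List


def split_sentence_blocks(lines: Iterable[str]) -> Iterator[List[str]]:
    """Group lines into sentence blocks separated by blank lines."""
    block: List[str] = []
    for line in lines:
        stripped = line.rstrip("\r\n")
        if stripped.strip():
            block.append(stripped)
        elif block:
            # A blank (whitespace-only) line closes the current sentence.
            yield block
            block = []
    if block:
        # No trailing blank line or newline: still emit the last sentence.
        yield block


# Abbreviated sample input that does not end in a newline.
source = "# sent_id = 1\n1\tHello\n\n# sent_id = 2\n1\tWorld"
blocks = list(split_sentence_blocks(source.splitlines()))
```

Each block would then be handed to a sentence constructor, which is the step where a ParseError could arise for malformed lines.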
-
__iter__() → Iterator[pyconll.unit.sentence.Sentence]
Allows for iteration over every sentence in the CoNLL-U file.
- Yields
Each sentence in this Conll object.
-
__len__() → int
Returns the number of sentences in the CoNLL-U file.
- Returns
The size of the CoNLL-U file in sentences.
-
__setitem__(key: int, sent: pyconll.unit.sentence.Sentence) → None
__setitem__(key: slice, sents: Iterable[pyconll.unit.sentence.Sentence]) → None
Set the given location to the given Sentence or sentences.
- Parameters
key – The location in the Conll file to set. This accepts integer or slice keys, including negative indices.
item – The item to set at that location (sent or sents in the signatures above). This can be an individual sentence for an integer key, or an iterable of sentences such as another Conll object for a slice key.
-
conll() → str
Output the Conll object to a CoNLL-U formatted string.
- Returns
The CoNLL-U object as a string. This string will end in a newline.
- Raises
FormatError – If there are issues converting the sentences to the CoNLL format.
-
insert(index: int, value: pyconll.unit.sentence.Sentence) → None
Insert the given sentence into the given location.
This function behaves in the same way as Python's list.insert.
- Parameters
index – The numeric index to insert the sentence into.
value – The sentence to insert.
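Since insert is documented as behaving like the insert method of Python lists, the built-in list semantics are a useful reference. The snippet below uses a plain list of strings as a stand-in for a Conll object holding sentences.

```python
sentences = ["a", "b", "c"]

sentences.insert(1, "x")     # placed before the element at index 1
sentences.insert(100, "y")   # an index past the end appends
sentences.insert(-1, "z")    # a negative index counts from the end
```

After these three calls the list is `["a", "x", "b", "c", "z", "y"]`: an out-of-range index never raises, and `-1` inserts before the last element rather than after it.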
-
write(writable: Any) → None
Write the Conll object to something that is writable.
For file writing, this method is more efficient than calling conll and then writing the result, since no string of the entire Conll object is created. The output includes a final newline as detailed in the CoNLL-U specification.
- Parameters
writable – The writable object such as a file. Must have a write method.
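The efficiency point can be sketched as follows: writing sentence by sentence means only one sentence's text is held at a time, whereas joining everything first builds the full document in memory. The function write_blocks is a hypothetical illustration that takes pre-rendered sentence strings; it is not pyconll's write method, and the blank-line separator is an assumption based on the CoNLL-U layout.

```python
import io
from typing import Any, Iterable


def write_blocks(sentences: Iterable[str], writable: Any) -> None:
    """Stream each sentence to the target without building one big string."""
    for sentence in sentences:
        writable.write(sentence)
        # Each sentence is followed by a blank line, so the output
        # always ends in a newline.
        writable.write("\n\n")


# Any object with a write method works; StringIO stands in for a file here.
buffer = io.StringIO()
write_blocks(["1\tHello", "1\tWorld"], buffer)
output = buffer.getvalue()
```

Because the only requirement on the target is a write method, the same call works with an open file handle, a socket wrapper, or an in-memory buffer.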
-