conll

A collection of CoNLL annotated sentences. To create new instances of this object, API callers should use the pyconll.load module, which abstracts over the resource type. The Conll object can be thought of as a simple wrapper around a list of sentences that can be serialized into CoNLL format.

Conll is a subclass of MutableSequence, so append, reverse, extend, pop, remove, and __iadd__ are available free of charge, even though they are not defined below.
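
The mixin behaviour described above can be sketched with the standard library alone. The SentenceList class below is a hypothetical stand-in, not pyconll's implementation, and plain strings stand in for Sentence objects. It defines only the five abstract methods that MutableSequence requires, and the remaining methods are inherited.

```python
from collections.abc import MutableSequence


class SentenceList(MutableSequence):
    """Hypothetical stand-in for Conll: a thin wrapper around a list."""

    def __init__(self, items=()):
        self._items = list(items)

    # The five abstract methods that MutableSequence requires.
    def __getitem__(self, key):
        return self._items[key]

    def __setitem__(self, key, value):
        self._items[key] = value

    def __delitem__(self, key):
        del self._items[key]

    def __len__(self):
        return len(self._items)

    def insert(self, index, value):
        self._items.insert(index, value)


# Plain strings stand in for Sentence objects.
sents = SentenceList(["s1", "s2"])
sents.append("s3")            # inherited mixin method
sents.extend(["s4", "s5"])    # inherited mixin method
sents += ["s6"]               # inherited __iadd__
sents.remove("s2")            # inherited mixin method
last = sents.pop()            # removes and returns the last item
sents.reverse()               # in-place reversal
remaining = list(sents)
```

None of append, extend, __iadd__, remove, pop, or reverse is defined on SentenceList itself; they all come from the MutableSequence base class.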

API

Defines the Conll type and the associated parsing and output logic.

class pyconll.unit.conll.Conll(it)[source]

The abstraction for a CoNLL-U file. A CoNLL-U file is essentially an ordered collection of sentences. These sentences can be accessed by numeric index or by slice. Note that sentences must be separated by whitespace. CoNLL-U also specifies that the file must end in a newline, but that requirement is relaxed here when parsing.
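
To make the separation and trailing-newline rules concrete, here is a minimal sketch of grouping lines into per-sentence blocks. The helper split_sentence_blocks is hypothetical and is not how pyconll parses; it assumes that "separated by whitespace" means a blank or whitespace-only line between sentences.

```python
def split_sentence_blocks(lines):
    """Group an iterable of lines into one list of lines per sentence.

    Hypothetical helper for illustration; not part of pyconll.
    """
    block = []
    for line in lines:
        stripped = line.rstrip("\r\n")
        if stripped.strip():
            block.append(stripped)
        elif block:
            # A blank (whitespace-only) line closes the current sentence.
            yield block
            block = []
    if block:
        # Relaxed ending: the last sentence is kept even when the input
        # does not finish with a blank line or a trailing newline.
        yield block


source = (
    "# sent_id = 1\n"
    "1\tHello\thello\tINTJ\t_\t_\t0\troot\t_\t_\n"
    "\n"
    "# sent_id = 2\n"
    "1\tBye\tbye\tINTJ\t_\t_\t0\troot\t_\t_"   # no trailing newline
)
blocks = list(split_sentence_blocks(source.splitlines(keepends=True)))
```

Here blocks holds two lists of lines, one per sentence, even though the input does not end in a newline.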

__contains__(other)[source]

Check if the Conll object has this sentence.

Parameters:other – The sentence to check for.
Returns:True if exactly this Sentence is in the Conll object; False otherwise.
__delitem__(key)[source]

Delete the Sentence corresponding to the given key.

Parameters:key – Identifies the Sentence or Sentences to delete. Can be an integer position in the file or a slice.
__getitem__(key)[source]

Index a sentence by key value.

Parameters:key – The key to index the sentence by. The key can be either an integer or a slice.
Returns:The corresponding sentence if the key is an int, or another Conll object containing the selected sentences if the key is a slice.
Raises:TypeError – If the key is not an integer or slice.
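
The indexing contract above can be illustrated with a small stand-in. MiniConll is a hypothetical class written for this sketch, not the pyconll implementation, and strings stand in for Sentence objects. It shows the three documented outcomes: an int returns one item, a slice returns a new collection of the same type, and any other key raises TypeError.

```python
class MiniConll:
    """Hypothetical stand-in that mimics the documented indexing rules."""

    def __init__(self, sentences):
        self._sentences = list(sentences)

    def __len__(self):
        return len(self._sentences)

    def __getitem__(self, key):
        if isinstance(key, int):
            return self._sentences[key]
        if isinstance(key, slice):
            # A slice yields another collection, not a bare list.
            return MiniConll(self._sentences[key])
        raise TypeError("key must be an int or a slice")


corpus = MiniConll(["s1", "s2", "s3", "s4"])
first = corpus[0]       # a single item
middle = corpus[1:3]    # a new MiniConll holding "s2" and "s3"

try:
    corpus["s1"]        # neither an int nor a slice
except TypeError as err:
    error_message = str(err)
```

Returning the same collection type from a slice means the result can be sliced, iterated, or serialized again without conversion.
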
__init__(it)[source]

Create a CoNLL-U file collection of sentences.

Parameters:it – An iterator of the lines of the CoNLL-U file.
Raises:ParseError – If there is an error constructing the sentences in the iterator.
__iter__()[source]

Allows for iteration over every sentence in the CoNLL-U file.

Yields:Each sentence in this Conll object.
__len__()[source]

Returns the number of sentences in the CoNLL-U file.

Returns:The size of the CoNLL-U file in sentences.
__setitem__(key, sent)[source]

Set the given location to the Sentence.

Parameters:key – The location in the Conll file to set to the given sentence. Only integer keys are accepted; negative indices are supported.
conll()[source]

Output the Conll object to a CoNLL-U formatted string.

Returns:The Conll object as a CoNLL-U formatted string. This string will end in a newline.
insert(index, value)[source]

Insert the given sentence into the given location.

This method behaves the same way as Python's list.insert.

Parameters:
  • index – The numeric index at which to insert the sentence.
  • value – The sentence to insert.
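
Since insert is documented as matching list insertion, the behaviour of a plain Python list is a useful reference. The snippet below uses an ordinary list of strings in place of a Conll object and Sentence objects, so it shows the list semantics only.

```python
sentences = ["s1", "s2", "s3"]

sentences.insert(0, "first")     # insert before the current first item
sentences.insert(-1, "penult")   # negative index: insert before the last item
sentences.insert(100, "tail")    # index past the end: appended, no IndexError
```

Note that an out-of-range index does not raise with list.insert; the value is placed at the nearest end.
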
write(writable)[source]

Write the Conll object to something that is writable.

When the goal is only to write the output, this method is more efficient than calling conll and writing the result, because no string holding the entire Conll object is created. The output ends with a newline.

Parameters:writable – The writable object such as a file. Must have a write method.
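
The difference between conll and write can be sketched as follows. MiniConll is a hypothetical stand-in, not pyconll's code; it assumes each sentence is already a CoNLL-U string and that every sentence is followed by a blank line. Any object with a write method can serve as the target, so an in-memory io.StringIO is used here in place of a file.

```python
import io


class MiniConll:
    """Hypothetical stand-in; each sentence is already a CoNLL-U string."""

    def __init__(self, sentences):
        self._sentences = list(sentences)

    def conll(self):
        # Builds the whole document as one string in memory.
        return "".join(sentence + "\n\n" for sentence in self._sentences)

    def write(self, writable):
        # Streams one sentence at a time; no full-document string is built.
        for sentence in self._sentences:
            writable.write(sentence)
            writable.write("\n\n")


corpus = MiniConll([
    "1\tHello\thello\tINTJ\t_\t_\t0\troot\t_\t_",
    "1\tBye\tbye\tINTJ\t_\t_\t0\troot\t_\t_",
])

buffer = io.StringIO()   # any object with a write method would do
corpus.write(buffer)
streamed = buffer.getvalue()
```

Both paths produce the same text in this sketch; the streaming version simply avoids holding a second full copy of it in memory.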