I'm looking to parse stories with SyntaxNet, but I'm extremely new to both the program and databases. https://research.googleblog.com/2016...rlds-most.html

I'm told that, in the end, I'll want it to be in an inverted index.
http://nlp.stanford.edu/IR-book/html...d-index-1.html

To gain the speed benefits of indexing at retrieval time, I'll have to build the index in advance. The major steps in this are:

1. Collect the documents to be indexed.

2. Tokenize the text, turning each document into a list of tokens.

3. Do linguistic preprocessing, producing a list of normalized tokens, which are the indexing terms.

4. Index the documents that each term occurs in by creating an inverted index, consisting of a dictionary and postings.
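
To check my understanding of the steps above, here is a minimal Python sketch of how I picture them. The names `InvertedIndex`, `add_document`, and `lookup` are my own made-up names, not anything from SyntaxNet or a real database, and the "linguistic preprocessing" is just lowercasing as a stand-in. Everything lives in an in-memory dict, which is an assumption that would probably not hold for thousands of stories.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Step 2: split raw text into word tokens (very naive)."""
    return re.findall(r"[A-Za-z0-9']+", text)


def normalize(tokens):
    """Step 3: placeholder for linguistic preprocessing (lowercasing only)."""
    return [t.lower() for t in tokens]


class InvertedIndex:
    """Hypothetical in-memory index: dictionary of term -> postings."""

    def __init__(self):
        # Each term maps to the set of document IDs it occurs in.
        self.postings = defaultdict(set)

    def add_document(self, doc_id, text):
        """Step 4: record doc_id in the postings of every term in text."""
        for term in normalize(tokenize(text)):
            self.postings[term].add(doc_id)

    def lookup(self, term):
        """Return the sorted document IDs containing term."""
        return sorted(self.postings.get(term.lower(), ()))


# Step 1: collect the documents (two toy "stories").
stories = {
    1: "The cat sat on the mat.",
    2: "The dog chased the cat.",
}

index = InvertedIndex()
for doc_id, text in stories.items():
    index.add_document(doc_id, text)

# A story that arrives later is added to the existing postings.
index.add_document(3, "A dog sat down.")

print(index.lookup("cat"))
print(index.lookup("dog"))
```

In this toy version, a new story is simply merged into the existing postings rather than rebuilding everything, which is the behavior I'm hoping a real inverted index can also give me.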


My question is: if I have to get everything in order before putting the information into an inverted index database, what holds my data while I'm collecting it from thousands of stories? And can I append to the information already inside it by re-indexing?