Concordance, vocabulary & n-gram #101

@sethwoodworth

Description

from: gitberg-temp/issues/20 @whitten

Is it possible to have some tools that would report various measures on the text?
For example, a concordance of all the words in the archived text, with the location or page number of each occurrence,
as well as n-grams for the work like Google does: each word (1-gram), each pair of words (2-gram), etc., with the number of times each one shows up in the work.
I'm sure the tools exist; I'm just not sure where.
I do think the resulting files would be a good addition to the repository for each text.
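
To make the request concrete, here is a rough sketch of what such a tool might compute, using only the Python standard library. The function names (`tokenize`, `concordance`, `ngram_counts`) are hypothetical and not part of any existing GITenberg tooling. The tokenizer is deliberately naive, and line numbers stand in for "location", since a plain-text file has no page numbers.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Split text into lowercase word tokens (naive; an assumption of this sketch)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def concordance(text):
    """Map each word to the 1-based line numbers on which it occurs."""
    index = defaultdict(list)
    for line_no, line in enumerate(text.splitlines(), start=1):
        for word in tokenize(line):
            index[word].append(line_no)
    return dict(index)


def ngram_counts(text, n):
    """Count every run of n consecutive tokens across the whole text."""
    tokens = tokenize(text)
    # Slide a window of width n over the token list; n-grams may span line breaks.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


sample = "the cat sat\nthe cat ran"
print(concordance(sample))      # word -> list of line numbers
print(ngram_counts(sample, 2))  # 2-gram -> number of occurrences
```

The results could be written out as JSON or CSV next to each text. A real tool would probably want a better tokenizer and some way to map positions back to pages in the source edition.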
