Java Linguistics Software

Browse free open source Java Linguistics Software and projects below. Use the toggles on the left to filter open source Java Linguistics Software by OS, license, language, programming language, and project status.

  • 1
    WordNet database in various SQL formats
    Downloads: 59 This Week
  • 2

    ISO GrAF

    Experimental Java library for reading and writing GrAF/XML files.

    The Graph Annotation Framework (GrAF) models linguistic annotations using a data model based on graph theory and algorithms. The GrAF standard is a work product of ISO TC37SC4 Working Group 1. This Java library is NOT part of the GrAF standard, and standoff annotation files produced by the library may not be GrAF compliant.
    Downloads: 73 This Week
  • 3

    Wordcorr

    Data management for comparative linguistics

    Wordcorr automates the tedious and risky process of tabulating and managing the sound correspondences used in working out the historical development of natural languages. Initial support was from NSF.
    Downloads: 15 This Week
  • 4

    Ghawwas_V4

    An open source system for Arabic corpora processing

    Ghawwas (previously known as Khawas) is an open source system for Arabic corpora processing. Ghawwas V4.0 provides the following main functions:
    a. Frequency lists for single words and n-grams
    b. Concordance
    c. Collocation (MI, Chi-squared, LL, T-score, Z-score, Dice, Log Dice, Weirdness Coefficient)
    d. Lexical pattern search
    e. Frequency profile comparison of two corpora based on MI, Chi-squared, LL, T-score, Z-score, Dice, Log Dice, Weirdness Coefficient
    f. Accepts Windows and UTF-8 character encodings
    g. Accepts TXT, DOC, DOCX, RTF and HTML formats
    h. Exports the processing results in CSV format
    Downloads: 20 This Week
  • 5

    sgmweka

    Weka wrapper for the SGM toolkit for text classification and modeling.

    Provides Sparse Generative Models for scalable and accurate text classification and modeling in high-speed, large-scale text mining. Classification has lower time complexity than in comparable software because inference is based on a sparse model representation and an inverted index. The provided .zip file is in the Weka package format, giving access to text classification. Other functions are usable either from the Java command line or by including the classes in Java projects.
    Downloads: 19 This Week
  • 6

    Korean Analyzer Rhino

    Parsing Korean words by morpheme and part of speech

    RHINO parses Korean words by morpheme and part of speech. Its dictionaries are based on the Korean Modern Tagged Corpus (on the scale of 12 million phrases), which was produced by the Korean government, so it handles many cases of stems and endings. The newly developed Dynamic Dictionary Technology, in effect a programmed database, lets words react to their context. For more information, see the files in the help folder.
    Downloads: 18 This Week
  • 7

    srt-translator

    Subtitle translator from one natural language to another.

    Translates subtitles in SubRip format from one natural language to another. It is based on Google Translate without the API, and therefore without payment. The translator has automatic and manual spell checkers.
    Downloads: 16 This Week
  • 8
    oopinyinguide
    OO Pinyin Guide is a Java extension for OpenOffice 3 or higher. It enables the user to add pinyin transliteration over Chinese characters inside a text document. This tool can be useful for people learning or teaching Chinese.
    Downloads: 6 This Week
  • 9

    TXM

    Unicode XML TEI text analysis platform

    TXM is a free and open-source cross-platform Unicode & XML based text analysis environment and graphical client, supporting Windows, Linux and Mac OS X. It can also be used online as a J2EE standard compliant web portal (GWT based) with access control built in. Download the latest version of TXM: http://textometrie.ens-lyon.fr/spip.php?rubrique61&lang=en TXM offers a comprehensive range of analysis tools (concordances, collocate search, frequency lists, etc.) based on the powerful CQP full-text search engine (http://cwb.sourceforge.net) and a range of statistical functions (factorial analysis, classification, co-occurrence analysis, etc.) based on R packages (http://www.r-project.org). Read the scientific background at the Textométrie project web site http://textometrie.ens-lyon.fr/?lang=en. Read a full description at the TEI Tools wiki http://wiki.tei-c.org/index.php/TXM.
    Downloads: 10 This Week
  • 10

    BioC

    We describe a simple XML format to share text documents and annotation

    A minimalist approach to sharing text documents and data annotations. Allows a large number of different annotations to be represented.

    Project files contain:
    - simple code to hold/read/write data and perform sample processing
    - BioC-formatted corpora
    - BioC tools that work with BioC corpora

    BioC goals:
    - simplicity
    - interoperability
    - broad use
    - reuse

    There should be little investment required to learn to use a format or a software module to process that format. We are interested in reuse, and we focus on common NLP tasks that are broadly useful for text mining.
    Downloads: 9 This Week
  • 11

    LaBB-CAT

    A linguistic annotation store

    LaBB-CAT is a browser-based linguistics research tool that stores recordings and regular-expression searchable text transcripts of interviews. The search results, entire transcripts, and media can be viewed or exported in a variety of formats.
    Downloads: 3 This Week
  • 12
    Entity recognition and normalization software for biomedical text
    Downloads: 2 This Week
  • 13

    Welsh Natural Language Toolkit

    WNLT is a suite of open source natural language modules for the Welsh

    The project supports the Welsh Language Technology domain with a set of NLP tools that drive innovation and advance the development of sophisticated textual analysis solutions. The WNLT project delivers four core NLP modules:
    a) Word Segmentation, for separating text into words
    b) Sentence Boundary Disambiguation, for finding sentence boundaries
    c) Part of Speech Tagger, for determining the part of speech of each word
    d) Morphological Analyser, for identifying the root form (lemma) of words

    The modules are written in Java and "wrapped" for execution under the General Architecture for Text Engineering (GATE) framework. The project also includes CYMRIE, a version of the GATE ANNIE Named Entity Recognition (NER) application adapted for Welsh, covering a range of entities such as Persons, Organisations, Locations, and date and time expressions.
    Downloads: 2 This Week
  • 14

    XML-Print

    XML-Print: typesetting arbitrary XML documents in high quality

    "XML-Print" is a joint project of the FH Worms (Prof. Marc W. Küster) and the University of Trier (Prof. Claudine Moulin) with support from TU Darmstadt (Prof. Andrea Rapp). Its goal is the creation of an XML formatter designed especially for the needs of the "Digital Humanities". The project is funded by the DFG. Please visit https://sites.google.com/a/budabe.eu/xmlprint_de/kontakt and let us know what you think about XML-Print: Does it meet your expectations? What is missing? Do you use it regularly? Thank you.
    Downloads: 1 This Week
  • 15

    Bermuda Text-to-Speech

    This project includes basic NLP and DSP techniques for Text-to-Speech

    See the TTS demo at: http://rslp.racai.ro/index.php?page=tts This project is written entirely in Java and includes a set of tools and methods designed to enable multilingual Text-to-Speech (TTS) synthesis. We currently support English and Romanian, but we will soon train more models and make them available for download. If you want to read more about our other NLP and TTS tools, check out http://nlptools.racai.ro.
    Downloads: 1 This Week
  • 16

    Colloquium QDA

    A free and open source qualitative ethnographic interview coding tool.

    Colloquium QDA is a tool for custom coding and analyzing qualitative ethnographic interviews. To run, make sure you first have JRE 8 or later installed (http://www.oracle.com/technetwork/java/javase/downloads/). Colloquium QDA is an open source cross-platform Java Swing app utilizing an embedded Java DB with Lucene integrated search.
    Downloads: 1 This Week
  • 17
    The Gramlab project aims to provide companies with free, open source software tools that can be used by developers who are not specialists in language processing. Note: the GLabCorpus Manager tool requires a SolR server to be installed. To download it and for more information, please go to the Files section.
    Downloads: 1 This Week
  • 18
    Helsinki Finite-State Technology
    The Helsinki Finite-State Transducer toolkit is intended for processing natural language morphologies. The toolkit is demonstrated by wide-coverage implementations of a number of languages of varying morphological complexity.
    Downloads: 1 This Week
  • 19

    Khawas

    An Arabic Corpora Processing Tool

    The new version is available at https://sourceforge.net/projects/ghawwasv4/
    Downloads: 1 This Week
  • 20

    Musaheb

    An Arabic collocation extraction tool

    “Musaheb” is an Arabic collocation extraction tool designed and implemented to overcome the limitations of existing collocation extraction tools. It can extract n-gram collocations up to 5-grams, as well as the collocates of nodes (the word types whose collocates are being sought) within a window of zero to 15 words. It also provides eight collocation statistics for calculating the strength of a collocation, and permits the input of various constraints during node selection and collocate extraction. Based on the user's preferences for node, concordance and collocate selection, the tool saves all nodes and their associated collocates in an XML file, allowing easy conversion to different formats.
    Downloads: 1 This Week
  • 21

    NetBeans Dictionaries

    Additional dictionary files for the NetBeans spellchecker.

    Downloads: 1 This Week
  • 22

    RoWordNetLib

    Java API for the Romanian WordNet

    Java API for the Romanian WordNet. Please note that the actual WordNet for Romanian (the XML file containing the network) is not included; due to its license restrictions, it can be obtained from: http://ws.racai.ro:9191/repository/browse/romanian-wordnet-30/4611a43efb6811e2a8ad00237df3e3580b6b50d1111c4a6292694bded91d5c14/

    If you would like a direct download of the API together with the RoWordNet XML file, please download both from: http://www.racai.ro/tools/text/rowordnet/

    Please cite: Dumitrescu, Ștefan Daniel. RoWordNetLib - The first API for the Romanian WordNet. In Proceedings of the Romanian Academy, Series A (The Publishing House of the Romanian Academy), vol. 16, no. 1, pp. 87-94, 2015. Article PDF at: http://www.acad.ro/sectii2002/proceedings/doc2015-1/12-Dumitrescu.pdf

    It is free for research use. Use demo.java as a quick-start guide for using this software.
    Downloads: 1 This Week
  • 23

    iLastic

    Query, integrate and manipulate data using natural languages.

    iLastic is an open-source framework to query, integrate and manipulate any type of data in English. Extract, transform and merge information from the web, databases, files or any other data repository using a language you already know... English
    Downloads: 1 This Week
  • 24
    Downloads: 0 This Week
  • 25
    Annoschemer is a small tool for easy editing of MMAX2 annotation schemes.
    Downloads: 0 This Week