We provide middleware for applications that need to analyze, transform or mine natural language data.
Our products “shred” content into a list of numbers, each corresponding to a particular concept behind a word or an idiom. The concepts are semantically interconnected. In addition to the usual phone numbers, emails, locations, addresses, and other recognizable patterns, the flexible architecture of our products offers a choice of over 100,000 entities (closer to 120,000). It is possible to find, for example, names of illicit drugs (including street names), locations in the US, components of explosive devices, nuclear physics terms, or words with British spelling. The same applies to domains of discourse. Once the text is analyzed, it is possible to generate text of equivalent meaning in any other language in the database, or in the same language while altering properties such as style, measures and metrics, etc.
Thorough text analysis requires much more than simple string matching. For example, the word “Nice” in a sentence like “Nice to see you here” is not a city, while “Nice is a great place to relax” does mention a city. Our products are able to distinguish between different meanings of the same word.
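To make the “Nice” example concrete, here is a minimal sketch in Python of why plain string matching is not enough. It is a toy illustration only, not how our products work internally: the concept codes, the cue word lists, and the function name `disambiguate_nice` are all assumptions made up for this example.

```python
import re

# Hypothetical concept codes, invented for this illustration.
CONCEPT_NICE_PLEASANT = 1001  # "nice" as in pleasant
CONCEPT_NICE_CITY = 2001      # Nice, the city in France

# Toy context cues (assumed): a verb right after, or a preposition right before,
# a capitalized "Nice" suggests the place-name reading.
VERBS_AFTER = {"is", "was", "has", "lies"}
PREPOSITIONS_BEFORE = {"in", "to", "from", "near"}

def disambiguate_nice(sentence):
    """Return a concept code for the first 'nice' in the sentence, or None."""
    tokens = re.findall(r"[A-Za-z']+", sentence)
    for i, token in enumerate(tokens):
        if token.lower() != "nice":
            continue
        previous_word = tokens[i - 1].lower() if i > 0 else ""
        next_word = tokens[i + 1].lower() if i + 1 < len(tokens) else ""
        capitalized = token[0].isupper()
        if capitalized and (next_word in VERBS_AFTER
                            or previous_word in PREPOSITIONS_BEFORE):
            return CONCEPT_NICE_CITY
        return CONCEPT_NICE_PLEASANT
    return None

greeting = disambiguate_nice("Nice to see you here")
travel = disambiguate_nice("Nice is a great place to relax")
```

Both sentences contain the same string, yet under these toy rules the first maps to the “pleasant” concept and the second to the city. A real analyzer needs far richer context than two cue lists, which is the point of the paragraph above.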
Why DigitalSonata.com Might Be a Killer
Unstructured data and text analytics seem to be gaining momentum, and integrators and corporate developers always get better deals. Creating a solution from scratch is expensive and often not viable, and turnkey solutions might not be flexible enough.
1. Accuracy.
2. Abstraction and community involvement. It is possible to build a machine translation (MT) engine or a text mining engine without our involvement, using GUI management tools.
3. Powerful functionality. Extract domains, regional usage, and concepts; transliterate text; and more.
4. Ease of use. You can embed it in your application without knowing anything about natural language processing. Simply feed in the text and get back a series of program-friendly codes.
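As a rough sketch of the last point, the snippet below shows how an application might consume a list of concept codes with no knowledge of natural language processing. The code values, the `CATEGORY_OF` table, and the function `group_by_category` are hypothetical and serve only to illustrate the idea; they are not the actual product interface.

```python
from collections import defaultdict

# Hypothetical mapping from concept code to entity category (assumed values).
CATEGORY_OF = {
    1001: "quality",
    2001: "city",
    2002: "city",
    3001: "phone_number",
}

def group_by_category(codes):
    """Group concept codes by category, preserving their order in the text."""
    groups = defaultdict(list)
    for code in codes:
        groups[CATEGORY_OF.get(code, "unknown")].append(code)
    return dict(groups)

# Codes as an analyzer might return them for one document (made-up values).
codes = [2001, 1001, 3001, 2002]
groups = group_by_category(codes)
cities = groups.get("city", [])
```

Because the application only handles integers and category labels, tasks such as “list every city mentioned” reduce to an ordinary dictionary lookup.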







