21 January 2016

Corpus Italiano: corpus of contemporary Italian texts from the web

This corpus of contemporary Italian texts from the web was created within the PAISÀ project to provide a large resource of freely available Italian texts for language learning through the study of authentic text materials.

It is a unique language resource for Italian because it combines the following features: it is a corpus of web texts (harvested in September/October 2010), and it is composed entirely of freely available and freely distributable texts.

Although created primarily for language learning, the corpus is also a rich resource for researchers and translators. To serve different user groups, the interface will offer different modes of access to the corpus, ranging from precompiled searches to fully flexible search options for constructing complex queries.

For more detailed information, please check: Corpus Italiano


The Accademia della Crusca offers an exhaustive list of databases, corpora and historical documents.

Banche dati, corpora e archivi testuali (Databases, corpora and text archives) - Treccani

11 January 2016

Extract terms from a URL



This tool extracts terms from a single English web page retrieved from the URL you supply. Some of the extracted and translated terms may not be relevant to you. On request, the terms can be machine translated and stored in a monolingual or bilingual output format. Because the terms are translated out of context, they will most likely need to be checked and corrected before being used in a translation production environment.
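To give a rough idea of what term extraction from a web page involves, here is a minimal sketch in Python. It is not the tool's actual method, which the post does not describe: it assumes that simple word-frequency counting is good enough, and the names `extract_terms`, `_TextCollector` and `STOPWORDS` are hypothetical helpers made up for this illustration.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# Tiny illustrative stopword list (an assumption; real tools use far larger ones).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "are",
             "for", "on", "with", "from", "by", "this", "that"}


class _TextCollector(HTMLParser):
    """Collects visible text and skips the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def extract_terms(html, top_n=10):
    """Return the top_n most frequent candidate terms as (term, count) pairs."""
    parser = _TextCollector()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks).lower()
    # Candidate terms: alphabetic words (hyphenated forms allowed), longer than
    # two letters, and not in the stopword list.
    words = [w for w in re.findall(r"[a-z]+(?:-[a-z]+)*", text)
             if len(w) > 2 and w not in STOPWORDS]
    return Counter(words).most_common(top_n)
```

The function takes the page's HTML as a string, so the page itself would still have to be downloaded first, for example with `urllib.request.urlopen`. A frequency list like this has the same limitation the tool warns about: the candidates come out of context and need to be reviewed by hand.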
Technical Services & Management for the Translation Industry:





The Monco corpus search engine

The Monco corpus search engine: "Language changes as we speak. New words and new senses of familiar words are coined and recorded in dictionaries every year. Daily frequencies of 'content words' vary immensely as they are chosen to report events in the media. Words such as ‘vape’, ‘hangry’ or ‘emoji’ are either heavily under-represented or not present at all in reference corpora of English which were compiled only a few years ago. Also, within days, frequencies of words such as ‘migrant’ or ‘refugee’ may become relatively higher than ever before. Monco can help you keep track of such variation."
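The kind of variation the quote mentions can be made concrete with a small sketch in Python. It is not Monco's implementation: it assumes whitespace tokenisation and a per-million-words normalisation, and the function names `relative_frequency` and `track_word` are hypothetical, introduced only for this illustration.

```python
def relative_frequency(tokens, word):
    """Occurrences of `word` per million tokens (0.0 for an empty text)."""
    if not tokens:
        return 0.0
    return tokens.count(word) / len(tokens) * 1_000_000


def track_word(daily_texts, word):
    """Map each day to the relative frequency of `word` in that day's text.

    `daily_texts` maps a day label to the raw text collected on that day.
    Tokenisation is a plain lowercase whitespace split (an assumption).
    """
    return {day: relative_frequency(text.lower().split(), word)
            for day, text in daily_texts.items()}
```

Normalising by the size of each day's text is what makes days comparable: a raw count would rise simply because more text was collected on a given day.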




Inclusive Git branch naming

The “main” branch name is used to avoid names like “master” and “slave” branches; a “feature branch” is used for a new feature or bug fix. The shift fr...