13 June 2024

How to spot if a text has been AI generated

“Delve into the realms of terminology tools and discover how they can speed up your research, improve efficiency, and transform content accuracy.”

Who uses "delve" in "today's world"?

A few days ago I was in one of those mums' WhatsApp chats, and one of the mums was promoting her online business on I-don't-remember-what. What I do remember is my reaction when I immediately spotted that her announcement had been written with ChatGPT. I thought, "Damn, no! Even here now?" I was OK with wiki documentation written with AI, which is a lifesaver to be honest, but even in the mums' chat? Nooo, please!

the infamous mom chat

Anyway, I'm not against ChatGPT; on the contrary, I'm a big fan and an enthusiastic user. I even used it to write my farewell email to my ex-colleagues (only to find out that my manager had also used it to write a you-will-be-missed post about me leaving the company). And yes, I'm pretty sure that if you use AI on a daily basis, you can immediately recognise when a text has been AI generated. No need to use an AI detector (and there are many available online): you can just "sniff" it out when you stumble upon these words:

Overuse of “crucial”: Once you recognise its prevalence in AI-generated content, the word “crucial” stands out. While the term can be, well, crucial in certain contexts, keeping an eye out for its repetition throughout a text can help you tell AI text from human writing.

“Delve, Dive, Discover”: AI loves some particular word patterns. Among these are the use of the verbs “Delve,” “Dive,” and “Discover” followed by “…into the exciting world of x”. It’s a copywriting technique, but after noticing its frequency in AI-generated content, I personally cannot stand it anymore. 

“Unlock”: followed by "the potential of", very common in AI marketing.

“Ensure”: AI uses this word more than any reasonable human ever does.

“In today’s world”: just NO. Sounds like a '90s email scam.
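
If you would rather let a script do the sniffing, the list above can be turned into a rough phrase counter. This is only a minimal sketch: the regular expressions and the function name `count_telltales` are my own assumptions, and a high count is a hint, not proof that a text was AI generated.

```python
import re
from collections import Counter

# Tell-tale phrases from the list above. The exact patterns are an
# assumption: they are one possible way to match these expressions.
TELLTALE_PATTERNS = {
    "crucial": r"\bcrucial\b",
    "delve/dive/discover": r"\b(?:delve|dive|discover)\b",
    "unlock the potential of": r"\bunlock(?:ing)?\s+the\s+(?:\w+\s+)?potential\s+of\b",
    "ensure": r"\bensur(?:e|es|ed|ing)\b",
    "in today's world": r"\bin\s+today[’']s\s+world\b",
}


def count_telltales(text: str) -> Counter:
    """Count how often each tell-tale phrase occurs in the text."""
    counts = Counter()
    for label, pattern in TELLTALE_PATTERNS.items():
        hits = re.findall(pattern, text, flags=re.IGNORECASE)
        if hits:
            counts[label] = len(hits)
    return counts


sample = (
    "Delve into the realms of terminology tools and discover how they can "
    "speed up your research, improve efficiency, and transform content accuracy."
)
print(count_telltales(sample))
```

The counter only looks at surface wording, so it will also flag a human who happens to like the word “crucial”.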


1 June 2019

Current Terminology Management Systems: designed to make you think hard

Current TMSs have been developed according to the requirements defined by ISO standards for terminology management systems. Nevertheless, the process of creating a new entry is still very time-consuming.

A couple of weeks ago, I was contacted because of a post I wrote years ago on this blog (I guess they read this: How would a collaborative platform improve terminology work?).
"Dear Maria, I read your blog article and just wanted to let you know we are building the collaborative terminology system you envisioned at the very moment. It would be a pleasure to get in touch to see if you want to test and send some feedback on our current status!"
Well, we scheduled a video call in the following days, during which they showed me the platform in progress and some mockups of a few features they planned to implement. What I was shown looked very promising indeed - since that old post influenced somebody, I also felt encouraged to write another post about the platform to discuss how I think it would work best. The previous post focused on the collaborative aspect of the tool; this time I chose to focus on what was missing in the platform demo: mainly, mandatory fields.
Mandatory fields are crucial for creating a new entry, but they are also very useful for limiting the time spent actually creating it. Terminology requires a lot of effort and precision – don’t we know that by now? – but it has to be done regardless. So my question is: could a platform help us invest less time whilst still allowing us to do a good job? Let’s have a look.
Current needs of the users (this is real feedback I’ve gathered from seminars and other events I attended):
  • The insertion of terms should take place WHILE translating, considering that the whole translation process is often carried out within a VERY short period of time. After the translation project is completed, terminology work is either forgotten or kicked to the curb for an unknown future.
  • The interface of the Terminology Management Systems currently on the market discourages translators/terminologists by forcing them to spend a large part of their time creating term records. And in translation, we know there is no time to waste. The result? Nobody really finds the time for terminology (I've heard that one a trillion times).
How can we make the process of term entry creation faster and easier?
Well, by managing the terminology data on the principle of: “as much as necessary and as little as possible”, whilst still complying with the “minimum requirements” set for a terminological entry.
I’ll show you how it could be implemented in a few steps below:
· Set the minimum mandatory fields for the entries
· Let the reference/bibliographical resources be automatically filled by the system through automatic metadata insertion
· Have a user-friendly GUI, suitable also for non-expert users.

Here’s a possible solution
Translators/terminologists should be allowed to invest as little time as possible creating entries that comply with all terminological requirements. By setting minimum mandatory data categories, translators can create new entries in a handful of minutes; any optional fields can then be filled at a later stage, reducing the hindrance.
ISO 12620 (Terminology in computer applications – Data categories) lists almost 200 possible data categories for a terminological entry, whilst ISO 12616 considers only three as mandatory: term, term source and entry date.
According to LISA (Localization Industry Standards Association), whenever the Terminology Management System is to be used exclusively by humans (not machine processed), it must include either a definition or a context.
Designating certain fields as mandatory can be problematic. For instance, it may take considerable time and effort to find definitions and contexts, discuss them, and enter them into the Terminology Management System. It may be more productive to allow this information to be added later.

Creating a new Data Entry Interface
The system should guide the user to enter the term first, then the term reference, context, and other Term Level data, then the definition and other Language Level data, and finally the Entry Level data (domains, etc.). Data categories can be designated either in compliance with ISO 12620 or with user-specific needs in mind.
A ‘create new entry’ tab should provide three mandatory fields: term, term source, and entry date. A terminological entry would not be saved if a mandatory field is empty (important for validation). Any remaining data category can be grouped in an ‘Optional data categories’ field.
An auto-save functionality should ensure that data isn’t lost if technical problems occur. A query suggestion system should also warn the user when the entry is already in the system (important for preventing duplicates, more details below).
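
The validation and duplicate check described above can be sketched in a few lines. This is a minimal illustration under my own assumptions, not the platform's actual design: the names `TermEntry`, `Termbase`, `missing_fields` and `find_duplicate` are hypothetical, and duplicates are matched on the term string alone, ignoring case and surrounding spaces.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Optional

# The three mandatory data categories named in the text above.
MANDATORY_FIELDS = ("term", "term_source", "entry_date")


@dataclass
class TermEntry:
    term: str = ""
    term_source: str = ""
    entry_date: Optional[date] = None
    # Everything else (definition, context, domain...) can be filled in later.
    optional: Dict[str, str] = field(default_factory=dict)

    def missing_fields(self) -> List[str]:
        """Return the names of the mandatory fields that are still empty."""
        missing = []
        for name in MANDATORY_FIELDS:
            value = getattr(self, name)
            if isinstance(value, str):
                value = value.strip()
            if not value:
                missing.append(name)
        return missing


class Termbase:
    def __init__(self) -> None:
        self._entries: Dict[str, TermEntry] = {}

    @staticmethod
    def _key(term: str) -> str:
        # Assumption: two terms are duplicates if they differ only in case/spacing.
        return term.strip().casefold()

    def find_duplicate(self, term: str) -> Optional[TermEntry]:
        """Look up an existing entry, e.g. to warn the user while typing."""
        return self._entries.get(self._key(term))

    def save(self, entry: TermEntry) -> None:
        """Refuse to save incomplete or duplicate entries."""
        missing = entry.missing_fields()
        if missing:
            raise ValueError("Mandatory fields missing: " + ", ".join(missing))
        if self.find_duplicate(entry.term) is not None:
            raise ValueError(f"'{entry.term}' is already in the termbase")
        self._entries[self._key(entry.term)] = entry
```

In an interface, `find_duplicate` would be called as the user types, so the warning appears before any time is spent on the rest of the entry; `save` is the last line of defence.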

Embedding an external source in reference fields
By inserting the URL of the external source, the system could convert the URL into embedded content. Embedded content is a rich article preview (name of the website, title of the page, etc.). This approach is in use on several content curation websites. The quality of the embedded content depends on the quality of the source website: the better the content is indexed, the more precise the preview will be; otherwise, the user has to edit/correct the data.
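
As a rough illustration of how such a preview could be built, the sketch below reads the page title and a few metadata tags from a page's HTML. It assumes the source page exposes Open Graph tags (`og:site_name`, `og:title`, `og:description`) or at least a `<title>`; the function name `build_preview` is hypothetical, and the sketch uses only the Python standard library.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class _MetaCollector(HTMLParser):
    """Collect the <title> text and the <meta> name/property values of a page."""

    def __init__(self) -> None:
        super().__init__()
        self.meta = {}
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            key = attributes.get("property") or attributes.get("name")
            content = attributes.get("content")
            if key and content:
                self.meta[key.lower()] = content.strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def build_preview(url: str, html: str) -> dict:
    """Build the data for a reference field from a URL and the page's HTML."""
    collector = _MetaCollector()
    collector.feed(html)
    meta = collector.meta
    return {
        "url": url,
        # Fall back to the domain name when the site does not name itself.
        "site": meta.get("og:site_name") or urlparse(url).netloc,
        "title": meta.get("og:title") or collector.title.strip(),
        "description": meta.get("og:description") or meta.get("description", ""),
    }
```

The fallbacks mirror the point made above: a well-indexed page fills every field, while a poorly indexed one leaves gaps that the user has to edit by hand. Fetching the HTML (for example with `urllib.request.urlopen`) is left out to keep the sketch short.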

So, what is your opinion?
  • Do you use Terminology Management Systems?
  • Do you hate/love them?
  • What is your favourite tool?
  • What’s your favourite functionality?
  • What do you hate the most about them?
  • What function do you really need but is missing in the tools on the market?


Feel free to contact me on Facebook, Twitter, and Instagram to talk about it :D



12 November 2018

The new IATE is online. Take a look!

The new version of IATE has a completely renewed look and feel, a more intuitive user interaction and better structured data. Its accessibility has been enhanced (keyboard and screen reader support) and its design is now responsive so that it can be accessed from any device.
Search results are more accurate thanks to a detailed filtered search and a domain filtering. The display of the terminology entries has been improved, including the possibility to select a bilingual, trilingual, or the whole multilingual entry.

As a member of the R&D team of IATE, the European Union terminology database, I’m thrilled to announce the release of the new revamped version of the most popular terminology database😎

We have been working for more than two years at the Translation Centre for the Bodies of the European Union, with the support of the EU institutions, to provide a more functional, modern, user-friendly IATE.

The new IATE has been completely redeveloped with new technologies, a better architecture, a modern design and improved functionalities. 
The data search and indexing are built with Elasticsearch, which allows numerous adaptations because of its open source technology. 
The accuracy of the retrieved results has been increased, and the search has been extended to other fields apart from the term. 
The results page has been improved and offers much more metadata without having to access the full entry.
The full view of the entries is much more modular, with a bilingual and trilingual view in columns or a multilingual view in the form of a list.
The new IATE offers general statistics in a very visual way, and a user manual with more detailed information than what was offered up to now. 
The new responsive version adapts to different devices (computer, tablet, mobile) and is made accessible for users with motor or visual disabilities (possibility of using the keyboard for navigation and screen readers). 
Finally, we provide a search API (Application Programming Interface) so that IATE can be easily queried from other platforms.

There is much more to come 😍

A second release with advanced data management features for EU terminologists and translators is scheduled for late January 2019.

Subsequent releases are planned for 2019 with many more features and improvements.

We have definitely worked hard to make the new IATE better, and we hope that you will enjoy using it and will help us with your suggestions to improve it even more in the future.

Check the new IATE at: https://iate.europa.eu/home




3 August 2018

You are doing terminology management all wrong. Here is why

We all know the never-ending, love-hate relationship between translators and terminology… now, let’s explore some of the most common errors.


Generally speaking, when thinking of terminology, we imagine a glossary, made of two parallel columns full of terms, with the source language on one side and the target language on the other.

Easy.

And what better than an Excel file for this type of structure? Seems easy and intuitive enough. Plus, you can also add an extra column to the right, to add comments or other notes.

Well, there’s something wrong here: Excel was never designed to store text, much less terminological data.

Yes, you guessed it… Excel was created to crunch numbers, not words!

Using Excel files is not an effective or efficient way to manage complex databases. If you use it to create glossaries as mentioned above, you will not be able to specify additional attributes for those terms. It is indeed possible to add extra columns but always limited to one field or category for each term. Last but not least, the glossary you create will feature very poor content – namely, only source language on one side and target language on the other.
Using Excel, you won’t be able to search or locate all the terms created previously (i.e. in different files). To find a term, you will find yourself guessing in which file it is located. So picture this common situation: you’ve got an unorganized mass of files, divided by project or field, or a single monolithic file containing all the terms from your previous translations, but you can’t find anything specific… Nice, huh?

Although Excel is widely used in the translation industry, it is regrettably not the ideal way to manage your terminology accurately. If you choose Excel, it’s because it seems easy to use, fast and easily accessible, right from your Office suite. And you can easily exchange files with colleagues. The method is common largely because there is a significant lack of terminology management programs on the market. When such programs do exist, they are intended mainly for LSPs, corporations and institutions, and are never designed or conceived for freelance translators.

Apart from using the so-overrated Excel, another very frequent mistake that translators make is treating terminology only in the context of a specific text rather than as a single terminological DB that can be enriched with new terms over time.

Translators are by nature careful and scrupulous because their work requires it. But they often have a tendency to manage terminology by opting for quick and painless solutions that, nevertheless, last as long as the translation itself: they are short-term remedies to short-term problems. In my view, when it comes to terminology data management, translators should instead take another route and choose a long-term approach, considering the valuable reuse and exchange of data as a priority in their discipline.

I gave a webinar last June, in collaboration with MateCat, where I shared some key takeaways to help the freelancer community deliver quality translations and work better and faster.
Around 500 participants attended the webinar and actively interacted by posing tons of questions. It turned out to be such an enjoyable experience.

To find out more about my webinar and the next ones by MateCat, please check their Webinars page.

If you like my T-shirt, you can buy one from my Etsy store :)

Enjoy your holidays!



10 July 2018

Terminology is the pinch of salt of translation

Last May I went to beautiful Porto to attend Aptrad’s 2nd International Conference, where I gave my presentation on terminology from a #foodporn perspective 😂. The topic was: terminology VS salting food...

If you think about it, salting food isn’t rocket science, but do you know what “a pinch of salt” actually looks like? How about the right way to sprinkle those crystals or flakes?

When a cooking step is as straightforward as “just add salt,” it’s easy to gloss over.
But since salt is arguably the most important ingredient in the kitchen, it’s worth being 100 percent sure you know exactly how to use it.



The same applies to terminology. Terminology is the pinch of salt of translation. Translators are by nature careful and scrupulous because their work requires it. But they often have a tendency to manage terminology by opting for quick and painless solutions that, nevertheless, last as long as the translation itself: they are short-term remedies to short-term problems.

I'd like to take this occasion to thank all my dear friends who came to my presentation, supported me and asked a lot of questions. It meant a lot to me and I appreciated it very much.

Talking about #foodporn, espresso in Porto is incredibly GOOD!!! I had the best espresso ever on the Douro Cruise, an unforgettable tasting experience. I had two espressos in a row, sooo good! If you love coffee, Porto is the place to be; you will never be disappointed. And once there, jump on the Douro Cruise for the astonishing, ever-changing landscape.

Again, thank you ApTrad for inviting me, looking forward to seeing you soon!



29 September 2017

More than AI, terminology can tell you how something should be translated in the future

Neural machine translation systems offer an opportunity for real progress in the quality of translations produced by machines. However, machine translation still produces unacceptably poor quality content, especially for established brands that (rightly) set a very high bar for their content and brand tone of voice (which can only be established through good terminology work).

Given the huge effort underway to vastly improve machine translation, it’ll likely redefine the role of humans in the translation process.
Shouldn't we be looking into ways of making termbases work together with machine translation engines and all the other available CAT-environment tools to contribute quality content? Terminologists need to rise to the challenge of integration with other CAT-environment tools, so that their assets can find their way into the general workflow. This can be achieved only through close cooperation with the developers of technical solutions and by understanding the specific needs of all categories of end user.

Translation memory is already able to facilitate faster human translation, providing translators with words and phrases that have already been translated, but only terminology can tell you how something should be translated in the future. 


5 May 2017

How to stop being paid per word, but on our own terms

For some time now, on social networks such as Facebook and Twitter, translators have been taking a stand against a mechanism that pushes them to accept underpaid work: payment by word count.

The market is full of crooks, but we too are good at ruining the market by settling for being underpaid, because we are not skilled at negotiating with clients and almost always let them win, as was said today at the BP Conference.

This is a mechanism to be dismantled. It demeans our work and diminishes the translator's potential. Let's think in terms of time. Say we need 30 minutes to translate 300 words. Those 300 words can be translated in 30 minutes, but they can also take an hour. It depends on the subject. In some cases you need to dig deeper, follow industry blogs, read articles, consult various references, and ask experts for advice.

The point is this: the client has to know it; it is part of our professionalism. 300 words born after a two-day study phase do not have the same value as 300 words translated in 30 minutes. No. Those words are worth those hours of research.

And those hours of learning pay for themselves, because we will reuse that terminology research (which will flow into our glossaries) for other translations on the same subject, and in that case we will be faster, because we will have gained experience and expertise in that field.

So, let's start educating clients and agencies. Let's make the client aware of the difficulties inherent in translating the text, and of the risk (including the financial risk) of a wrong translation and of an incorrect choice of terms (at the BP Conference there was also mention of the possibility of drawing up a contract with the client).

Let's start freeing ourselves from the per-word rate by betting on ourselves: on the value and quality of our translation, on our time, on the decision-making process behind choosing the right term; in short, on the quality of our work. We don't simply work with words; we manage an intangible asset: knowledge.


