Linear A and The Machine: a Brute Force Attack to Decrypt the Minoan Code

This is the story of a quest. A story which stems from a stream of failures and some gleams of perseverance, where countless attempts clash with repeated falls, a story which exists because some people gave up and others took up the baton to continue the quest. It is a story which dates back more than one hundred years, includes some exciting moments and bitter disappointments, and repeatedly falls beneath a landslide of resignation and powerlessness before starting anew.

This is a story of Linear A. Not its complete history (there are books and academic papers for that: e.g., Robinson, passim; Perono Cacciafoco, 154-170), but the story of one of its latest ongoing decipherment attempts, born out of the almost absolute certainty of its undecipherability. It is, therefore, a story on Linear A.

The Rediscovery of ‘Linear A’

Linear A was (re-)discovered by Sir Arthur Evans at the beginning of the 20th century. After more than one hundred years and many failed decipherments, a considerable number of scholars have lost all hope that we will ever be able to solve its puzzle. Others still work patiently on it, trying to unravel its tangle.

At the level of definition, Linear A is, indeed, a currently undeciphered writing system from Bronze Age Crete (specifically from the New Palace Period), in the Aegean Sea. It ‘hides’ the so-called Minoan language, which, at the moment, cannot be read. The script was used by the Minoans (possibly a pre-Greek civilization) approximately between 1800 BC and 1450 BC.

When the Mycenaeans (who were an Indo-European – Greek – people coming from continental Greece) invaded Crete between the 15th and 14th centuries BC, they used Linear A as a model for their syllabic writing system, Linear B, which transcribed their Indo-European language, Mycenaean Greek.

Arthur Evans, who was excavating the Minoan palace of Knossos in 1900, unearthed, in a relatively short time, around 3000 clay tablets carrying inscribed symbols (i.e., writing) and was the scholar who named the two main scripts found in Crete (besides the so-called ‘Cretan Hieroglyphs’) ‘Linear A’ and ‘Linear B’ (apparently because they are mainly composed of lines and are linearly written) (Evans A, 327-395; Evans B, passim).

Linear A is attested on clay tablets, votive tables and offering vessels in stone, ceramic fragments, and silver and gold hairpins and roundels (Petrolito, Petrolito, Perono Cacciafoco, and Winterstein, 95-104). The script is composed of possible syllabic signs, combined characters with ligatures, possible ideograms, and possible numerals and measurement symbols (Salgarella, s.v. Linear A).

Linear A tablet from the palace of Zakros, Archeological Museum of Sitia. (Olaf Tausch /CC BY 3.0)

While Linear B was deciphered by Michael Ventris in 1952, in one of the most exciting discoveries in Philology and Linguistics of the last century (and a testament to human ingenuity) (Chadwick, passim), Linear A remains undeciphered. All decipherment attempts have failed so far, and the hope of reading the Minoan language is slowly fading (Perono Cacciafoco, 154-170).

What We Have To Work With

I started to work on Linear A a long time ago, when I was at the University of Pisa, in Italy. To learn, I reproduced some of the previous decipherment attempts and also tried to follow some new leads, looking for a possible path. To no avail, unfortunately.  

It is not an exaggeration to say that Linear A presents its aspiring glyph-breakers with unique, possibly insurmountable challenges. The corpus is not so small that the writing system can automatically be declared undecipherable. However, it is not big enough to provide consistent data and materials that would make anyone optimistic about a possible success in its interpretation.

The fact that many symbols are shared with the deciphered (and, therefore, readable) Linear B should be an added value, an ‘asset’. Unfortunately, the ‘natural’ operation of reading Linear A through the phonetic values of possibly corresponding syllabograms from Linear B leads nowhere and is, probably, one of the reasons for the many misunderstandings generated by a number of past interpretations.

That said, there is not much that can be done to improve approaches and methods. If Linear A transcribes an isolate language (a language for which we cannot find a linguistic family and/or possibly related languages), no matter how we try to read its symbols and the phonetic values that we might ascribe to them, we will never be able to decipher it.

This option is, fortunately, quite unlikely. It is very difficult (and, probably, it would even be methodologically wrong) to think that Minoan had no connections with any other contemporary language spoken, in its times, in the Aegean and Mediterranean areas and surrounding territories. Languages, indeed, are never ‘isolated’ in themselves. They become that way because they get more and more unknown and undocumented over centuries and millennia and, due to that, we are not able to historically reconstruct their mutual relationships and their origins anymore. Our ignorance of their ‘kinship’, therefore, does not mean that they do not belong to a language family.

Linear A tablet, Chania Archaeological Museum. (Ursus/CC BY-SA 3.0)

Getting to Grips With the Script

Some years ago, I landed an academic job at Nanyang Technological University (NTU), in Singapore. At the time, I was still working on Linear A at the documentary level and as a part of my teaching, but I was not actively trying to decipher it anymore.

However, my story with the interpretation of that ancient script from Crete was not yet completely over. At NTU, I was teaching a course in the History of Cryptography which included a couple of classes on Language Deciphering. Those classes were often my students’ favorites, the ones drawing the most attention, discussion, and questions. Through them, those students discovered their passion for Classics, to the point of wanting to learn Ancient Greek and Latin and how to read Linear B. From there, they started to talk about the decipherment of Linear A. Interest in the impenetrable writing system was reignited.

A research grant on Linear A eventually followed, and sixteen research assistants were hired from among those students. Despite the enthusiasm, we knew that, no matter what, we would carry biases that could lead to possibly unsolvable mistakes. Like many scholars before us, we would have believed that we ‘knew’ (or could ‘guess’) the language ‘hidden’ behind Linear A. As a consequence, we would plausibly have tried to demonstrate our unavoidably arbitrary assumptions linguistically, and that would have led us very far from our goal.

Before even starting, therefore, the team had to find a way to ‘remove’ all kinds of baseless postulations, groundless intuitions, and risky hypotheses, a way to impartially investigate all possible options (no matter how unlikely), independently of their ‘mainstream’ or ‘niche’ nature.

Linear A inscription on a clay tablet from Crete, probably 15th century BC. Archaeological Museum of Heraklion. (Zde/CC BY-SA 4.0)

How Could Brute Force Crack This?

Cryptology, which is the association of Cryptography (the art of inventing, developing, and implementing codes, ciphers, and crypto-systems) with Cryptanalysis (the art of deciphering, decrypting, and reverse-engineering codes, ciphers, and crypto-systems), came to the rescue, by providing us with an idea for the possibly most unbiased procedure that one can apply while trying to ‘crack’ a crypto-system: the so-called ‘brute force attack’.

In Cryptanalysis, a ‘brute force attack’ basically consists of assigning to each digit of a cipher (which can be a letter, a number, a character, a symbol, etc.) all of its theoretically (and reasonably) possible values, and continuing in that way until parts of the text (segments, words, clusters, phrases, etc.) become readable. In other words, the process continues until those parts make sense, leading the cryptanalysts from an incomprehensible cipher-text to a (relatively) clear and rationally readable plain-text.
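As a minimal sketch of the principle only (not of the method applied to Linear A), the following Python example tries every possible key of a simple shift cipher and keeps the candidates in which known words become readable. The cipher, the word list, and the function names are illustrative assumptions and do not come from the project described here.

```python
import string

ALPHABET = string.ascii_uppercase

def shift(text, key):
    """Shift every letter back by `key` positions; leave other characters alone."""
    out = []
    for ch in text:
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) - key) % 26])
        else:
            out.append(ch)
    return "".join(out)

def brute_force(ciphertext, known_words):
    """Try all 26 keys and rank candidate plain-texts by the number of known words."""
    candidates = []
    for key in range(26):
        plain = shift(ciphertext, key)
        hits = sum(1 for word in plain.split() if word in known_words)
        if hits:
            candidates.append((hits, key, plain))
    return sorted(candidates, reverse=True)

# Illustrative word list and cipher-text (assumptions for this sketch).
WORDS = {"THE", "PORT", "OF", "KNOSSOS"}
cipher = "WKH SRUW RI NQRVVRV"
best = brute_force(cipher, WORDS)[0]
print(best)
```

The exhaustive loop over keys is the ‘brute force’ part; the word list plays the role of the clues and ‘cribs’ against which each candidate reading is checked.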

A procedure like that needs enough text to work on, to be able to recognize clues and ‘cribs’ and to perform frequency and pattern analyses (and, possibly, some advanced statistical calculations), a considerable amount of time, unwavering patience, strong intellectual skills, and, obviously, luck. All that is necessary to find what I call ‘lucky matches’, possible segments of actual text which allow one to ‘crack’ the whole system, generating a sort of ‘rain effect’.

Sometimes, when the text and the related data are too large, a ‘brute force attack’ is not convenient (or possible), because the amount of time and effort required to perform it outweighs the possible benefits of the decryption. However, when applicable and well-developed, a ‘brute force attack’ can effectively ‘crack’ a code when its (cipher-)key is unknown.

The Benefits of Brute Force

A procedure like that is not new, in Language Deciphering. Undeciphered writing systems are, after all, unintended crypto-systems. Originally developed to facilitate (written) communication, their knowledge is lost over centuries and millennia, making their characters the equivalent of digits of the cipher-text of a cipher. The idea of trying to decode an undeciphered script by cryptanalytic means, therefore, represents a quite reasonable approach.

Michael Ventris, for instance, used a ‘brute force’ system of grids while trying to decipher Linear B. The system was derived from techniques he presumably learned during his service in the Royal Air Force and, thanks to his brilliant intuitions, led to the decipherment of the script (Chadwick, passim).

Ventris systematically attributed apparently random phonetic values to the Linear B symbols (which he knew were syllabograms thanks to the previous impressive work on the writing system developed by Alice Kober), compiling a potentially endless system of grids. Then, he applied to some symbols relatively arbitrary assumptions (like the – correct – values of the sounds A for a specific phonogram and N-I for a specific syllabogram) and was able to find his ‘lucky match’, the cluster of characters transcribing the place name Amnisos (Ἀμνισός, the port of Knossos), syllabically A-MI-NI-SO.

Once he had those (correct) syllabic values, a ‘rain effect’ was triggered and, by using the grids and by attributing those values to the corresponding syllabograms in other clusters, he was able to start reading Linear B, word by word. The language ‘hidden’ behind the script revealed itself to be an archaic variant of Greek (Mycenaean Greek, indeed), spoken and transcribed long before the introduction in Greece of the ‘classical’ Ancient Greek alphabet (derived from the Phoenician alphabet). It was an extraordinary discovery, which opened our eyes to the secrets of the ancient Mycenaean civilization and society.

Ventris did not think that Linear B was transcribing Ancient Greek almost until the last moments of its decipherment. However, the prevalently unbiased procedure he applied helped him not to fall victim to any misconception.

Based on that experience, our team’s idea was to develop a completely ‘aseptic’ method to deal with Linear A. What we wanted to do, indeed, was to ‘attack’ its symbols with all possible phonetic values from all the languages of the Mediterranean and surrounding areas that are possibly compatible at the geographical and chronological level. A daunting task, quantitatively (because of the number of possible languages and phonetic values involved) and intellectually (because of the huge and all-embracing linguistic knowledge and expertise required).

A computer had to help. We needed people with strong coding expertise and a lot of knowledge of ancient languages potentially connected with the elusive Minoan. We informally called our computer and the software we developed to run on it ‘The Machine’ (Loh, passim).

How ‘Linear A’ Met ‘The Machine’

Despite how ‘specialized’ all this may look, the underlying idea was quite basic. Take Linear A, all the documents, tablets, fragments, tables, seals / tokens, etc. A corpus with transcriptions is already available (GORILA, passim). Take that into account, but transcribe everything again, document by document, symbol by symbol.

A Unicode font is available, but you need something more accurate, which should include possible ligatures and graphical variants. Develop your own non-Unicode font for Linear A, then, for precise transcriptions.

Transliterate all the Linear A text you can through Linear B phonetic values (when Linear A and Linear B syllabograms apparently coincide), to allow for a rudimentary (and potential) reading of the Linear A text in itself.

Compile spreadsheets with all the transcriptions (through your Linear A font) and transliterations (through your ‘Linear B readings’).

Take lexicons of ancient languages (and proto-languages) which might be compared to Minoan independently of a specific language family (like Indo-European, Semitic, Afro-Asiatic, etc.), e.g., Ancient Egyptian, Hamito-Semitic, Hittite, Luwian, etc., and compile spreadsheets with the transcriptions of each single lemma, in a fashion that can overlap with the Linear A transcriptions. Do the same with more ‘unlikely’ languages, proto-languages, and language families (e.g., Albanian, [Proto-]Celtic, Uralic, Basque, etc.), which will be used as ‘calibration tests’.

Write a computer program which can read and process the different spreadsheets, producing exact comparisons among them within seconds. The program can take a specific lexicon (e.g., Ancient Egyptian) and run it against the entire Linear A corpus (your transcriptions and transliterations), finding possible relevant matches (words / phonetic clusters). The program then compiles a new spreadsheet with all the possible matches resulting from the comparison, transcribed one by one, with the indications of the documents where they have been found, their positions (lines, segments) in every specific document, their frequency in the corpus, their transcriptions with your non-Unicode Linear A font, potential IPA (International Phonetic Alphabet) transcriptions, and some statistical details. A semantic list, derived from the lexicons’ transcriptions, is automatically added, language by language.
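The steps above can be sketched, in a very simplified form, as follows. This is not the code of ‘The Machine’: the sign identifiers, the syllabic values, the documents, and the lexicon entries are all invented placeholders, and in-memory dictionaries stand in for the spreadsheets. The sketch only shows the general shape of the comparison: transliterate each line, look for lexicon entries as contiguous syllable sequences, and record document, line, position, and frequency.

```python
from collections import Counter

# Hypothetical sign-to-syllable table (placeholder IDs and values, not real readings).
SIGN_VALUES = {"S1": "ta", "S2": "na", "S3": "mi", "S4": "ru", "S5": "ko"}

# Hypothetical corpus: document name -> list of lines, each a list of sign IDs.
CORPUS = {
    "DOC-1": [["S1", "S2"], ["S3", "S4", "S5"]],
    "DOC-2": [["S5", "S1", "S2"], ["S9", "S4"]],  # "S9" has no known value
}

# Hypothetical target lexicon: syllable sequence -> gloss (all invented).
LEXICON = {("ta", "na"): "gloss-1", ("mi", "ru"): "gloss-2", ("ze", "ta"): "gloss-3"}

def transliterate(signs):
    """Map sign IDs to syllables, marking unreadable signs with '?'."""
    return [SIGN_VALUES.get(sign, "?") for sign in signs]

def find_matches(corpus, lexicon):
    """Return one row per place where a lexicon entry occurs in the corpus."""
    rows = []
    for doc, lines in corpus.items():
        for line_no, signs in enumerate(lines, start=1):
            syllables = transliterate(signs)
            for lemma, gloss in lexicon.items():
                n = len(lemma)
                for start in range(len(syllables) - n + 1):
                    if tuple(syllables[start:start + n]) == lemma:
                        rows.append({
                            "doc": doc,
                            "line": line_no,
                            "position": start + 1,  # 1-based sign position
                            "match": "-".join(lemma),
                            "gloss": gloss,
                        })
    # Add the corpus-wide frequency of each matched cluster to its rows.
    freq = Counter(row["match"] for row in rows)
    for row in rows:
        row["frequency"] = freq[row["match"]]
    return rows

rows = find_matches(CORPUS, LEXICON)
for row in rows:
    print(row)
```

Matching on whole syllables rather than on raw character strings is a design choice of this sketch: it prevents a lexicon entry from being ‘found’ across a syllable boundary that no scribe ever wrote.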

A procedure like that, if performed by hand, would require years and painstaking double-checking. ‘The Machine’, conversely, can complete a corpus comparison in a few seconds. Once the results are available, the ‘human moment’ starts. The members of the research team have to analyze the spreadsheets with the results, to ascertain whether the matches highlighted by ‘The Machine’ are reasonable and possible or just an irrelevant product of ‘chance’, and whether they can be linguistically significant (in practice, whether they can be actual, meaningful words in the context of a single tablet and of the corpus in its entirety).

Through this procedure, the chances of highlighting actual ‘lucky matches’ increase considerably, and that could potentially lead to the decipherment of Linear A based on the comparison with a possibly compatible (deciphered) language.

The process is unbiased, because no preconceived idea about the underlying language is allowed and considered. The chosen lexicons belong to all possibly compatible languages. As mentioned, even unlikely languages and proto-languages, like Basque and (Proto-)Celtic, are tested, to improve the functionality of ‘The Machine’ and to allow for statistical comparisons (Nepal and Perono Cacciafoco, 1-13).
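One simple way to picture the role of the ‘calibration’ languages is to compare how often a candidate lexicon matches the corpus with how often the unlikely lexicons do. The sketch below is only an illustration of that idea under invented data: the word forms, the lexicon names, and the `hit_rate` measure are assumptions, not the statistics actually used by the project.

```python
def hit_rate(corpus_words, lexicon):
    """Fraction of corpus words that also appear in the given lexicon."""
    if not corpus_words:
        return 0.0
    hits = sum(1 for word in corpus_words if word in lexicon)
    return hits / len(corpus_words)

# Invented transliterated clusters and lexicons (placeholders only).
corpus_words = ["ta-na", "mi-ru", "ko-ta", "ru-ko"]
lexicons = {
    "candidate": {"ta-na", "mi-ru", "ko-ta"},
    "control_a": {"ta-na"},
    "control_b": {"ru-ko", "se-po"},
}

rates = {name: hit_rate(corpus_words, lex) for name, lex in lexicons.items()}

# The best-scoring control gives a rough idea of what chance alone can produce.
baseline = max(rate for name, rate in rates.items() if name.startswith("control"))
excess = rates["candidate"] - baseline
print(rates, excess)
```

A candidate whose rate does not clearly exceed that of the controls offers no more evidence than chance resemblance would, which is why even matches from unlikely languages are worth recording.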

Obstacles with the Approach

Issues and weaknesses, nonetheless, keep surfacing. The main problem is represented by the ‘Linear B approach’. Reading Linear A through the possibly corresponding Linear B phonetic values has proven, over time, to be an ineffective or, at least, highly flawed strategy. However, the Linear B phonetic values are, at the moment, the only possible ‘approximation’ for a tentative reading of Linear A. In other words, without them it is impossible to proceed.

If the Mycenaeans not only borrowed the Minoan writing system, but also used the same syllabograms to transcribe sounds which were similar in Minoan and Mycenaean, this approach can be meaningful. If, conversely, the similarity between Linear A and Linear B, due to borrowing, is only ‘graphical’, then we have no reasonable way to read Linear A in itself.

Another problem is connected with the analysis performed by the researchers on the results produced by ‘The Machine’. There is no ‘rule’ (apart from common sense) to decide which ‘cluster’ (i.e., possible word) highlighted by the program is significant or a possible ‘lucky match’. Even a high frequency in the Linear A corpus would not imply that the interpretation is correct or that a lexical item derived from a targeted comparison with a specific language is relevant.

In that case, therefore, the arbitrariness, although we try to avoid it at all costs, becomes a part of the interpretation process (Eu Min, Xu, and Perono Cacciafoco, 44-48; Loh and Perono Cacciafoco, 927-943). Moreover, if Minoan is really an isolate language or if no linguistic connections exist between it and any of the tested languages, the comparisons cannot produce any meaningful and useful finding.

Independently of that, however, the systematic comparisons of our corpora generate plenty of potentially significant results, at least at the level of frequency analysis and pattern recognition, and provide us with a massive amount of data that traditional comparisons cannot quantitatively match.

Using The Machine Moving Forward

‘The Machine’ is an intuitive and user-friendly program which ultimately allows all kinds of macro-comparisons aimed at cryptanalytic analyses. The program works exactly as described above. The researchers ‘feed’ it with the Linear A transcriptions and a ‘target-lexicon’ and, in a few seconds, all the possible matches are highlighted in a separate document, with transcriptions, transliterations, details on the clusters’ origins and positions, frequency analysis, and statistics. Then the researchers start to analyze the findings and to check and double-check the results tablet by tablet, document by document, wherever they appear.

Due to the large amount of data and the different fields of expertise involved, the work is still in progress. Quite recently, I moved from NTU to Xi’an Jiaotong-Liverpool University (XJTLU), in Suzhou (Jiangsu), China, but the project is still regularly ongoing and is based on solid foundations.

The team completed ‘The Machine’, which was made available, open-source, to all the scholars interested in using it for all kinds of macro-comparisons (not necessarily in the field of Language Deciphering). Our font was effectively developed, as well as all the transcriptions and transliterations. Validation tests were implemented and the macro-comparisons started. While the process naturally needs constant commitment and continuous enhancements in every single procedure, it is now time to make the device and its tools run and work systematically.

People waiting for news on Linear A, therefore, should stay tuned. It is possible some surprises will come soon.

Top image: Linear A incised on tablets found in Akrotiri, Santorini. Source: Portum / CC BY-SA 3.0

By Francesco Perono Cacciafoco

References

Chadwick, John. 2014. The Decipherment of Linear B. 2nd Edition. Cambridge: Cambridge University Press. [Print]

Godart, Louis, and Olivier, Jean-Pierre. 1976-1985. Recueil des inscriptions en linéaire A (Collections of Inscriptions in Linear A) - GORILA. 5 Volumes. Paris: École française d’Athènes. [Print]

Eu Min, Niki Cassandra, Xu, Duoduo, and Perono Cacciafoco, Francesco. 2019. Coding to Decipher Linear A. IEEE eXpress Conference Publishing (IEEE Catalog Number: CFP19M10-ART). In Proceedings of the 2019 Pacific Neighborhood Consortium Annual Conference and Joint Meetings (PNC) - Regionality and Digital Humanities: South-South Connections, 15-18 October 2019, Nanyang Technological University (NTU), Singapore: 44-48. Available online: https://ieeexplore.ieee.org/document/8939625.

Evans, Arthur J. 1897. Further Discoveries of Cretan and Aegean Script: With Libyan and Proto-Egyptian Comparisons. Journal of Hellenic Studies, XVII: 327-395 - Evans A. Available online: https://www.jstor.org/stable/623835.

Evans, Arthur J. 1909. Scripta Minoa. Oxford: Clarendon Press - Evans B. [Print]

Loh, Jia Sheng Colin. 2021. Linear A Decipherment Programme. Available online: https://github.com/L-Colin/Linear-A-decipherment-programme.

Loh, Jia Sheng Colin, and Perono Cacciafoco, Francesco. 2021. A New Approach to the Decipherment of Linear A: Coding to Decipher Linear A, Stage 2 (Cryptanalysis and Language Deciphering: A ‘Brute Force Attack’ on an Undeciphered Writing System). In Grapholinguistics in the 21st Century: /gʁafematik/ June 15-17, 2020 - Proceedings, 2020, Part II. Brest: Fluxus Editions: 927-943. Available online: https://www.fluxus-editions.fr/gla5-cacc.pdf.

Nepal, Aaradh, and Perono Cacciafoco, Francesco. 2024. Minoan Cryptanalysis: Computational Approaches to Deciphering Linear A and Assessing Its Connections with Language Families from the Mediterranean and the Black Sea Areas. Information, 15, 2, 73: 1-13. Available online: https://www.mdpi.com/2078-2489/15/2/73.

Perono Cacciafoco, Francesco. 2017. Linear A and Minoan: Some New Old Questions. Annals of the University of Craiova: Series Philology, Linguistics / Analele Universității Din Craiova: Seria Ştiințe Filologice, Linguistică, 39, 1 / 2: 154-170. Available online: https://www.ceeol.com/search/article-detail?id=599787.

Petrolito, Tommaso, Petrolito, Ruggero, Perono Cacciafoco, Francesco, and Winterstein, Grégoire. 2015. Minoan Linguistic Resources: The Linear A Digital Corpus. In Proceedings of the 9th SIGHUM Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities (LaTech) / ACL (Association for Computational Linguistics)-IJCNLP, July 26-31, 2015, Beijing, PRC (China National Convention Center - CNCC): 95-104. Available online: https://aclanthology.org/W15-3715/.

Robinson, Andrew. 2009. Lost Languages: The Enigma of the World’s Undeciphered Scripts. London: Thames & Hudson. [Print]

Salgarella, Ester. 2022. Linear A. In Hornblower, Simon, Spawforth, Antony, and Eidinow, Esther (Eds.). 2012 - . Oxford Classical Dictionary, s.v. Linear A. Oxford: Oxford University Press. Available online: https://oxfordre.com/classics/display/10.1093/acrefore/9780199381135.001.0001/acrefore-9780199381135-e-8927?rskey=PyJDe3&result=1.

 
Dr Francesco PERONO CACCIAFOCO (Ph.D. University of Pisa, Pisa, Italy, 2011) is, currently, an Associate Professor in Linguistics at Xi'an Jiaotong-Liverpool University (XJTLU), School of Humanities and Social Sciences (HSS), Department of Applied Linguistics (LNG), Suzhou, China.
