What is this?

Language Explorer is a proof-of-concept data analysis tool that helps identify language groups in particular need of bible translation, drawing on various data sources and linguistic databases. This instance focuses on the languages of Aboriginal Australia. I love our Aboriginal people and want to see them have access to the bible, preferably in their native tongue.

Where to start


There are navigation links and fields at the bottom of each page. In the sections below, I refer to them collectively as the navigation section. When the name and ISO 639 language code are displayed in the tool, their colouring and formatting carry meaning according to the following legend:


Translation State: Whole Bible, New Testament, Portions, One book, No Scripture, Record Absent
L1 Speakers (from Joshua Project): None, 0-9, 10-99, 100+, Unknown
ISO Retirement state: Retired, Active

For a specific language group

The tool aims to have a page for each Aboriginal language group. If there are current speakers, the language group should be accessible in the tool. If there are no known speakers, the language group may be listed (many are, but I know that many are not). The page for an individual group presents data from the available sources along with a map showing an approximate location for the speakers. In many cases the language group is clustered in a small area, so the map is helpful, but in some cases it is spread over a large area, so the map can be misleading. Finally, the page has links to the available data sources (including those whose data are not included in the tool). To look for a specific language group:

For Data Analysis

The tool provides a way to analyse important data for language groups, to help assess the state of bible translation and the ability of a language group to fall back on an English translation of scripture when the preferred situation, a translation in their heart language, is absent. To perform this sort of analysis, follow the Language Table link in the navigation section. This shows a table of all language groups known to the tool; each column can be filtered, and the table can be sorted by a single column. When using the table: Examples:

Data sources

Some more information is available in the DataSources.md file.

Disclaimer on data quality

As this is a proof of concept, the focus has been on integrating discrete data sources. While care has been taken to identify errors during the data import, it would be inappropriate to treat this as a tool to drive decision making without reviewing the aggregation process and scrutinising the data itself. The purpose of this deployment is only casual review of the output of the aggregation engine, and to show what is possible with this style of aggregation and analysis.


I have not attempted to license the data that is imported into the database and subsequently displayed in the tool. If you are from an organisation whose data I am using, and I am violating your licence agreement by displaying it in this proof-of-concept instance, please contact me.

The software itself is licensed under the terms of the Licence file in the GitHub repository.


Should you wish to deploy this yourself, it's not going to be a smooth process. It is repeatable for me, and there is some documentation in the DataSources.md and install.md documents in the "docs" directory. Contact me if you have any problems - I'm happy to help. All the source code is available in my GitHub repository.

-- Edwin Steele (edwin@wordspeak.org).