using Pkg
Pkg.activate(".")
Pkg.instantiate()
Activating project at `~/projects/research/ordinal_decorrelation/ordinal_decorrelation_website`
In my paper "Global-scale phylogenetic linguistic inference from lexical resources", I describe a method for extracting character matrices from ASJP data. The code and data from applying this workflow to version 19 of ASJP are stored at https://osf.io/a97sz/.
In this script, the character vectors for a predefined set of glottocodes are extracted from the data in the OSF repository, and MrBayes scripts are created, one for each language family represented among those glottocodes.
Note that I load PyCall and import the Python package ete3, which is very convenient for manipulating phylogenies.
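To make the target of this workflow concrete, here is a minimal sketch of how a binary character matrix could be written as a NEXUS data block that MrBayes can read. The helper name `nexus_block`, the toy taxa and the exact `format` line are assumptions for illustration; the scripts actually generated by this workflow may differ.

```julia
# Hypothetical helper: write a binary character matrix as a NEXUS data block.
# `taxa` holds the row labels, `rows` holds one character string per taxon
# (symbols 0 and 1, with - for entries without data).
function nexus_block(taxa::Vector{String}, rows::Vector{String})
    @assert length(taxa) == length(rows) "one character string per taxon"
    nchar = length(first(rows))
    @assert all(length(r) == nchar for r in rows) "rows must have equal length"
    pad = maximum(length, taxa) + 2          # column width for the labels
    io = IOBuffer()
    println(io, "#NEXUS")
    println(io, "begin data;")
    println(io, "  dimensions ntax=$(length(taxa)) nchar=$(nchar);")
    # Assumed format line: restriction (binary) data, - treated as a gap.
    println(io, "  format datatype=restriction missing=? gap=-;")
    println(io, "  matrix")
    for (t, r) in zip(taxa, rows)
        println(io, "    ", rpad(t, pad), r)
    end
    println(io, "  ;")
    println(io, "end;")
    return String(take!(io))
end

# Toy input, not real ASJP data.
block = nexus_block(["lang_a", "lang_b", "lang_c"], ["0101-", "1100-", "0-011"])
print(block)
```

Building the block in an `IOBuffer` keeps the function free of file handling, so the same string can be written to one file per family.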
Row | soc_id | glottocode | political_complexity | hierarchy_within | domestic_organisation | agricultureLevel | settlement_strategy | exogamy | crop_type |
---|---|---|---|---|---|---|---|---|---|
| String7 | String15 | String3 | String3 | String3 | String3 | String3 | String3 | String3 |
1 | Aa1 | juho1239 | 1 | 3 | 3 | 0 | 1 | 1 | 1 |
2 | Aa2 | okie1245 | 1 | 2 | 1 | 0 | 1 | 1 | 1 |
3 | Aa3 | nama1265 | 2 | 2 | 1 | 0 | 1 | 1 | 1 |
4 | Aa4 | dama1270 | NA | NA | 2 | 0 | NA | 1 | NA |
5 | Aa5 | bila1255 | 1 | 2 | 1 | 0 | 1 | 1 | 1 |
6 | Aa6 | sand1273 | 2 | 2 | 1 | 5 | 3 | 1 | 6 |
7 | Aa7 | naro1249 | 1 | 2 | 1 | 0 | 1 | 1 | 1 |
8 | Aa8 | xamm1241 | 1 | 2 | 1 | 0 | 1 | NA | 1 |
9 | Aa9 | hadz1240 | 1 | 3 | 3 | 0 | 1 | 0 | 1 |
10 | Ab1 | here1253 | 1 | 3 | 3 | 0 | 1 | 1 | 1 |
11 | Ab10 | mpon1252 | 3 | 3 | 3 | 5 | 3 | 1 | 6 |
12 | Ab11 | xesi1238 | 4 | 3 | 2 | 5 | 3 | 0 | 5 |
13 | Ab12 | zulu1248 | 4 | 3 | 3 | 5 | 3 | 1 | 6 |
⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
1280 | Si9 | awet1244 | NA | NA | NA | 4 | NA | NA | NA |
1281 | Sj1 | kara1500 | 1 | 3 | 3 | 2 | 2 | 0 | 5 |
1282 | Sj10 | mbya1239 | 1 | 2 | 1 | 5 | 4 | 0 | 5 |
1283 | Sj11 | xava1240 | 1 | 4 | 3 | 2 | 1 | 0 | 6 |
1284 | Sj2 | xere1240 | 1 | 3 | 1 | 4 | 4 | 0 | 5 |
1285 | Sj3 | xokl1240 | 1 | 2 | 2 | 0 | 1 | 0 | 1 |
1286 | Sj4 | cane1242 | 1 | 4 | 3 | 4 | 1 | 0 | 5 |
1287 | Sj5 | kren1239 | 1 | 3 | 4 | 0 | 1 | 1 | 1 |
1288 | Sj6 | temb1276 | 1 | 3 | 4 | 5 | 4 | 0 | 5 |
1289 | Sj7 | apin1244 | 1 | 3 | 4 | 5 | 4 | 0 | 5 |
1290 | Sj8 | tupi1273 | 2 | 3 | 4 | 4 | 4 | 0 | 5 |
1291 | Sj9 | kaya1330 | 1 | 3 | 1 | 6 | 1 | 0 | 5 |
Next I fetch the Glottolog CLDF dataset (version 4.8, archived on Zenodo) to assign a language family to each glottocode.
glottolog_cldf_zip = "../data/glottolog_cldf.zip"
isfile(glottolog_cldf_zip) || begin
    download(
        "https://zenodo.org/records/8131091/files/glottolog/glottolog-cldf-v4.8.zip?download=1",
        glottolog_cldf_zip
    )
    run(`unzip $glottolog_cldf_zip -d ../data/`)
end
true
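The cell above uses the short-circuit form `isfile(path) || begin ... end`, so the download and unzip only run when the archive is missing; the `true` shown as output is simply the value of `isfile(...)` when the file is already there. The same idea can be sketched as a small reusable function. The names `ensure_file` and `fake_fetch` are hypothetical, and a temporary directory stands in for the real download.

```julia
# Hypothetical helper: run `fetch(path)` only if `path` does not exist yet.
function ensure_file(fetch::Function, path::AbstractString)
    isfile(path) || fetch(path)   # short-circuit: skip the fetch if the file is there
    return path
end

# Toy demonstration with a temporary directory instead of a real download.
calls = Ref(0)
fake_fetch(path) = (calls[] += 1; write(path, "payload"))

dir = mktempdir()
target = joinpath(dir, "archive.zip")
ensure_file(fake_fetch, target)   # file is missing, so fake_fetch runs
ensure_file(fake_fetch, target)   # file exists now, so nothing happens
println("fetch calls: ", calls[])
```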
pth = "../data/glottolog-glottolog-cldf-59a612c/cldf/"
glottolog_languages = CSV.File(joinpath(pth, "languages.csv")) |> DataFrame
function glottocode_2_family(g, glottolog_languages)
    # Find the row with the matching Glottocode
    row = findfirst(==(g), glottolog_languages.Glottocode)
    # If the Glottocode is not found, return a placeholder string
    if isnothing(row)
        return "Glottocode not found"
    end
    # Extract the family code
    family_code = glottolog_languages.Family_ID[row]
    # A missing family code means the entry has no parent family (e.g. an isolate),
    # so its own name is used as the family name
    if ismissing(family_code)
        return glottolog_languages.Name[row]
    end
    # Find the row for the family code
    family_row = findfirst(==(family_code), glottolog_languages.Glottocode)
    # If the family Glottocode is not found, return the name for the original Glottocode
    if isnothing(family_row)
        return glottolog_languages.Name[row]
    end
    # Return the name associated with the family Glottocode
    return glottolog_languages.Name[family_row]
end
insertcols!(d, :Family => [glottocode_2_family(g, glottolog_languages) for g in d.glottocode])
Row | soc_id | glottocode | political_complexity | hierarchy_within | domestic_organisation | agricultureLevel | settlement_strategy | exogamy | crop_type | Family |
---|---|---|---|---|---|---|---|---|---|---|
| String7 | String15 | String3 | String3 | String3 | String3 | String3 | String3 | String3 | String |
1 | Aa1 | juho1239 | 1 | 3 | 3 | 0 | 1 | 1 | 1 | Kxa |
2 | Aa2 | okie1245 | 1 | 2 | 1 | 0 | 1 | 1 | 1 | Nilotic |
3 | Aa3 | nama1265 | 2 | 2 | 1 | 0 | 1 | 1 | 1 | Khoe-Kwadi |
4 | Aa4 | dama1270 | NA | NA | 2 | 0 | NA | 1 | NA | Khoe-Kwadi |
5 | Aa5 | bila1255 | 1 | 2 | 1 | 0 | 1 | 1 | 1 | Atlantic-Congo |
6 | Aa6 | sand1273 | 2 | 2 | 1 | 5 | 3 | 1 | 6 | Sandawe |
7 | Aa7 | naro1249 | 1 | 2 | 1 | 0 | 1 | 1 | 1 | Khoe-Kwadi |
8 | Aa8 | xamm1241 | 1 | 2 | 1 | 0 | 1 | NA | 1 | Tuu |
9 | Aa9 | hadz1240 | 1 | 3 | 3 | 0 | 1 | 0 | 1 | Hadza |
10 | Ab1 | here1253 | 1 | 3 | 3 | 0 | 1 | 1 | 1 | Atlantic-Congo |
11 | Ab10 | mpon1252 | 3 | 3 | 3 | 5 | 3 | 1 | 6 | Atlantic-Congo |
12 | Ab11 | xesi1238 | 4 | 3 | 2 | 5 | 3 | 0 | 5 | Atlantic-Congo |
13 | Ab12 | zulu1248 | 4 | 3 | 3 | 5 | 3 | 1 | 6 | Atlantic-Congo |
⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
1280 | Si9 | awet1244 | NA | NA | NA | 4 | NA | NA | NA | Tupian |
1281 | Sj1 | kara1500 | 1 | 3 | 3 | 2 | 2 | 0 | 5 | Nuclear-Macro-Je |
1282 | Sj10 | mbya1239 | 1 | 2 | 1 | 5 | 4 | 0 | 5 | Tupian |
1283 | Sj11 | xava1240 | 1 | 4 | 3 | 2 | 1 | 0 | 6 | Nuclear-Macro-Je |
1284 | Sj2 | xere1240 | 1 | 3 | 1 | 4 | 4 | 0 | 5 | Nuclear-Macro-Je |
1285 | Sj3 | xokl1240 | 1 | 2 | 2 | 0 | 1 | 0 | 1 | Nuclear-Macro-Je |
1286 | Sj4 | cane1242 | 1 | 4 | 3 | 4 | 1 | 0 | 5 | Nuclear-Macro-Je |
1287 | Sj5 | kren1239 | 1 | 3 | 4 | 0 | 1 | 1 | 1 | Nuclear-Macro-Je |
1288 | Sj6 | temb1276 | 1 | 3 | 4 | 5 | 4 | 0 | 5 | Tupian |
1289 | Sj7 | apin1244 | 1 | 3 | 4 | 5 | 4 | 0 | 5 | Nuclear-Macro-Je |
1290 | Sj8 | tupi1273 | 2 | 3 | 4 | 4 | 4 | 0 | 5 | Tupian |
1291 | Sj9 | kaya1330 | 1 | 3 | 1 | 6 | 1 | 0 | 5 | Nuclear-Macro-Je |
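The family lookup above scans the languages table with `findfirst` for every glottocode. As a sketch of an alternative with the same fallback rules, the table can be indexed once with dictionaries. All codes and names below are made up, and `family_name`, `name_of` and `family_of` are hypothetical names, not part of the original script.

```julia
# Toy stand-in for languages.csv; all codes and names below are invented.
glottocodes = ["fama1111", "lang1111", "lang2222", "isol1111"]
lang_names  = ["Family A", "Language 1", "Language 2", "Isolate X"]
family_ids  = [missing, "fama1111", "fama1111", missing]

# Build the lookups once instead of scanning the table per query.
name_of   = Dict(zip(glottocodes, lang_names))
family_of = Dict(zip(glottocodes, family_ids))

function family_name(g, name_of, family_of)
    # Unknown glottocode: same placeholder as in the function above.
    haskey(name_of, g) || return "Glottocode not found"
    fam = family_of[g]
    # No family code (isolate or top-level family): fall back to the entry's own name.
    ismissing(fam) && return name_of[g]
    # Family code not in the table: also fall back to the entry's own name.
    return get(name_of, fam, name_of[g])
end

for g in ["lang1111", "isol1111", "zzzz9999"]
    println(g, " => ", family_name(g, name_of, family_of))
end
```

For a table with a few thousand rows either approach is fast enough; the dictionary version mainly makes the three fallback cases easy to read.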
The OSF repository contains two types of characters: cognate class characters and soundclass-concept characters. The two character matrices are downloaded and loaded in turn.
# Resolve the download link of the cognate class matrix via the OSF API
file_id = "h4a6z"
url = "https://api.osf.io/v2/files/$(file_id)/"
response = HTTP.get(url)
data = JSON.parse(String(response.body))
download_url = data["data"]["links"]["download"]
# Skip the first line, then split each line into a label and a character string
world_cc_ = DataFrame(
    hcat(
        split.(
            split(read(download(download_url), String), "\n")[2:end]
        )...
    ) |> permutedims, :auto)
rename!(world_cc_, :x1 => :longname, :x2 => :characters)
# Split every character string into single symbols, one column per character
world_cc = @pipe world_cc_.characters |>
    mapslices(x -> split.(x, ""), _, dims=1) |>
    hcat(_...) |>
    permutedims |>
    DataFrame(_, :auto) |>
    insertcols!(_, 1, :longname => world_cc_.longname)
Row | longname | x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | x9 | x10 | x11 | x12 | x13 | x14 | x15 | x16 | x17 | x18 | x19 | x20 | x21 | x22 | x23 | x24 | x25 | x26 | x27 | x28 | x29 | x30 | x31 | x32 | x33 | x34 | x35 | x36 | x37 | x38 | x39 | x40 | x41 | x42 | x43 | x44 | x45 | x46 | x47 | x48 | x49 | x50 | x51 | x52 | x53 | x54 | x55 | x56 | x57 | x58 | x59 | x60 | x61 | x62 | x63 | x64 | x65 | x66 | x67 | x68 | x69 | x70 | x71 | x72 | x73 | x74 | x75 | x76 | x77 | x78 | x79 | x80 | x81 | x82 | x83 | x84 | x85 | x86 | x87 | x88 | x89 | x90 | x91 | x92 | x93 | x94 | x95 | x96 | x97 | x98 | x99 | ⋯ |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | ⋯ | |
1 | AA.DIZOID.NAO | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
2 | AuA.KHASIAN.KHASI | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
3 | AuA.KHASIAN.KHASI_2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
4 | AuA.KHASIAN.LYNGNGAM | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
5 | AuA.KHASIAN.PNAR_JOWAI | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
6 | AuA.KHASIAN.WAR_JAINTIA | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
7 | Gun.GUNWINYGIC.BUAN | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
8 | Hok.YUMAN.YAVAPAI | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
9 | Iwa.IWAIDJAN.AMURDAK | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
10 | Iwa.IWAIDJAN.IWAIDJA | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
11 | LSR.GRASS.ABU | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
12 | NC.KWA.AJAGBE | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
13 | NC.NORTHERN_ATLANTIC.WOLOF_8 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋱ |
7421 | NC.BANTOID.NYANJA_NYASA | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
7422 | NC.KAINJI.KUKI | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
7423 | NC.KAINJI.REGI | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
7424 | NC.KAINJI.ROGO | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
7425 | NC.KAINJI.SHAMA | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
7426 | NC.KWA.AKPAFU | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
7427 | NC.PLATOID.BEROM_F | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
7428 | AA.BERBER.CHAOUI | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
7429 | Man.WESTERN_MANDE.SEEKU | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
7430 | An.CELEBIC.TOLAKI | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
7431 | AA.WEST_CHADIC.DERA | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
7432 | NC.KAINJI.SEGEMUK | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
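The parsing step is easier to follow on a small input. The sketch below uses an invented three-taxon file in the same two-column layout (label, then character string) and only Base Julia; the header line and all values are assumptions for illustration. Unlike the cell above, it also drops empty lines, and the cells are `Char` values rather than substrings.

```julia
# Toy file contents: a header line followed by "label characters" rows.
raw = """
3 5
lang_a 0101-
lang_b 1100-
lang_c 0-011
"""

lines  = filter(!isempty, split(raw, "\n")[2:end])  # drop the header and blank lines
fields = split.(lines)                               # [label, characters] per line
labels = String.(first.(fields))

# One row per taxon, one column per character.
chars = permutedims(reduce(hcat, collect.(last.(fields))))

println(labels)
println(size(chars))
```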
# Resolve the download link of the soundclass-concept matrix via the OSF API
file_id = "3em9h"
url = "https://api.osf.io/v2/files/$(file_id)/"
response = HTTP.get(url)
data = JSON.parse(String(response.body))
download_url = data["data"]["links"]["download"]
# Skip the first line, then split each line into a label and a character string
world_sc_ = DataFrame(
    hcat(
        split.(
            split(read(download(download_url), String), "\n")[2:end]
        )...
    ) |> permutedims, :auto)
rename!(world_sc_, :x1 => :longname, :x2 => :characters)
# Split every character string into single symbols, one column per character
world_sc = @pipe world_sc_.characters |>
    mapslices(x -> split.(x, ""), _, dims=1) |>
    hcat(_...) |>
    permutedims |>
    DataFrame(_, :auto) |>
    insertcols!(_, 1, :longname => world_sc_.longname)
Row | longname | x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | x9 | x10 | x11 | x12 | x13 | x14 | x15 | x16 | x17 | x18 | x19 | x20 | x21 | x22 | x23 | x24 | x25 | x26 | x27 | x28 | x29 | x30 | x31 | x32 | x33 | x34 | x35 | x36 | x37 | x38 | x39 | x40 | x41 | x42 | x43 | x44 | x45 | x46 | x47 | x48 | x49 | x50 | x51 | x52 | x53 | x54 | x55 | x56 | x57 | x58 | x59 | x60 | x61 | x62 | x63 | x64 | x65 | x66 | x67 | x68 | x69 | x70 | x71 | x72 | x73 | x74 | x75 | x76 | x77 | x78 | x79 | x80 | x81 | x82 | x83 | x84 | x85 | x86 | x87 | x88 | x89 | x90 | x91 | x92 | x93 | x94 | x95 | x96 | x97 | x98 | x99 | ⋯ |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | ⋯ | |
1 | AA.DIZOID.NAO | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
2 | AuA.KHASIAN.KHASI | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
3 | AuA.KHASIAN.KHASI_2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | ⋯ |
4 | AuA.KHASIAN.LYNGNGAM | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | ⋯ |
5 | AuA.KHASIAN.PNAR_JOWAI | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | ⋯ |
6 | AuA.KHASIAN.WAR_JAINTIA | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | ⋯ |
7 | Gun.GUNWINYGIC.BUAN | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
8 | Hok.YUMAN.YAVAPAI | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | ⋯ |
9 | Iwa.IWAIDJAN.AMURDAK | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | ⋯ |
10 | Iwa.IWAIDJAN.IWAIDJA | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
11 | LSR.GRASS.ABU | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | ⋯ |
12 | NC.KWA.AJAGBE | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
13 | NC.NORTHERN_ATLANTIC.WOLOF_8 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋱ |
7421 | NC.BANTOID.NYANJA_NYASA | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | ⋯ |
7422 | NC.KAINJI.KUKI | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
7423 | NC.KAINJI.REGI | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
7424 | NC.KAINJI.ROGO | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
7425 | NC.KAINJI.SHAMA | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
7426 | NC.KWA.AKPAFU | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
7427 | NC.PLATOID.BEROM_F | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
7428 | AA.BERBER.CHAOUI | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | ⋯ |
7429 | Man.WESTERN_MANDE.SEEKU | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
7430 | An.CELEBIC.TOLAKI | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
7431 | AA.WEST_CHADIC.DERA | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
7432 | NC.KAINJI.SEGEMUK | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
Next, I fetch and prepare the metadata for the ASJP doculects.
file_id = "w4jnf"
url = "https://api.osf.io/v2/files/$(file_id)/"
response = HTTP.get(url)
data = JSON.parse(String(response.body))
download_url = data["data"]["links"]["download"]
asjp_languages = @pipe CSV.read(
        download(download_url),
        missingstring="",
        DataFrame) |>
    dropmissing(_, :classification_wals) |>
    dropmissing(_, :Glottocode) |>
    filter(row -> row.recently_extinct == 0, _) |>
    filter(row -> row.long_extinct == 0, _) |>
    select(_, [:Name, :Glottocode, :Family, :classification_wals]) |>
    DataFrames.transform(_, [:classification_wals, :Name] => ByRow((x, y) -> string(x, ".", y)) => :longname) |>
    select(_, Not(:classification_wals)) |>
    DataFrames.transform(_, :longname => ByRow(x -> replace(x, "-" => "_")) => :longname) |>
    dropmissing
Row | Name | Glottocode | Family | longname |
---|---|---|---|---|
String | String15 | String31 | String | |
1 | A51_BAFIA_MAJA | lefa1242 | Atlantic-Congo | NC.BANTOID.A51_BAFIA_MAJA |
2 | A51_BAFIA_TUMI_TINGON | lefa1242 | Atlantic-Congo | NC.BANTOID.A51_BAFIA_TUMI_TINGON |
3 | A51_BAFIA_ZAKAAN | lefa1242 | Atlantic-Congo | NC.BANTOID.A51_BAFIA_ZAKAAN |
4 | A53_BAFIA_RIKPA | bafi1243 | Atlantic-Congo | NC.BANTOID.A53_BAFIA_RIKPA |
5 | A54_BAFIA_NJANTI | tibe1274 | Atlantic-Congo | NC.BANTOID.A54_BAFIA_NJANTI |
6 | A60_GUNU | nugu1242 | Atlantic-Congo | NC.BANTOID.A60_GUNU |
7 | A60_MMAALA | mmaa1238 | Atlantic-Congo | NC.BANTOID.A60_MMAALA |
8 | A61_NGORO_ASOM | tuki1240 | Atlantic-Congo | NC.BANTOID.A61_NGORO_ASOM |
9 | A62_KALONGE | yang1293 | Atlantic-Congo | NC.BANTOID.A62_KALONGE |
10 | A72a_EWONDO | ewon1239 | Atlantic-Congo | NC.BANTOID.A72a_EWONDO |
11 | AASAX | aasa1238 | Afro-Asiatic | AA.SOUTHERN_CUSHITIC.AASAX |
12 | ABAGA | abag1245 | Nuclear Trans New Guinea | TNG.EASTERN_HIGHLANDS.ABAGA |
13 | ABANYOM | aban1242 | Atlantic-Congo | NC.BANTOID.ABANYOM |
⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
8942 | ZOOMBO_3 | koon1244 | Atlantic-Congo | NC.BANTOID.ZOOMBO_3 |
8943 | ZOOMBO_4 | koon1244 | Atlantic-Congo | NC.BANTOID.ZOOMBO_4 |
8944 | ZOQUE_FRANCISCO_LEON | fran1266 | Mixe-Zoque | MZ.MIXE_ZOQUE.ZOQUE_FRANCISCO_LEON |
8945 | ZOQUE_RAYON | rayo1235 | Mixe-Zoque | MZ.MIXE_ZOQUE.ZOQUE_RAYON |
8946 | ZUGUNUK_KALASHA | kala1372 | Indo-European | IE.INDIC.ZUGUNUK_KALASHA |
8947 | ZULGO | zulg1242 | Afro-Asiatic | AA.BIU_MANDARA.ZULGO |
8948 | ZULU | zulu1248 | Atlantic-Congo | NC.BANTOID.ZULU |
8949 | ZULU_2 | zulu1248 | Atlantic-Congo | NC.BANTOID.ZULU_2 |
8950 | ZULU_NKANDLA | zulu1248 | Atlantic-Congo | NC.BANTOID.ZULU_NKANDLA |
8951 | ZUMBUN | zumb1240 | Afro-Asiatic | AA.WEST_CHADIC.ZUMBUN |
8952 | ZUNI | zuni1245 | Zuni | Zun.ZUNI.ZUNI |
8953 | ZWAY | zayy1238 | Afro-Asiatic | AA.SEMITIC.ZWAY |
I developed my own naming convention for ASJP doculects: [WALS family name].[WALS genus name].[doculect name]. These names must be matched with glottocodes.
longname2glottocode = Dict{String, String}(
zip(asjp_languages.longname, asjp_languages.Glottocode)
)
glottocode2longname = Dict{String, String}(
zip(asjp_languages.Glottocode, asjp_languages.longname)
)
glottocode2family = Dict{String, String}(
zip(asjp_languages.Glottocode, asjp_languages.Family)
)
# Target glottocodes without an ASJP doculect are mapped to themselves in both dictionaries.
for l in d.glottocode
    if l ∉ keys(glottocode2longname)
        longname2glottocode[l] = l
        glottocode2longname[l] = l
    end
end
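The naming convention and the dictionary construction can be illustrated on three rows taken from the metadata table above. This is a minimal sketch in plain Julia; `make_longname` is a hypothetical helper introduced only for illustration, and the hyphenated genus in the last line is made up. It also shows a property of `Dict` worth keeping in mind: when several doculects share a glottocode, `glottocode2longname` retains only the last one inserted.

```julia
# Hypothetical helper (not part of the original script): builds
# [WALS classification].[doculect name] and replaces hyphens by underscores.
make_longname(classification, name) =
    replace(string(classification, ".", name), "-" => "_")

# Three doculects from the metadata table; two of them share a glottocode.
doculects       = ["ZULU", "ZULU_2", "ZUNI"]
glottocodes     = ["zulu1248", "zulu1248", "zuni1245"]
classifications = ["NC.BANTOID", "NC.BANTOID", "Zun.ZUNI"]

longnames = make_longname.(classifications, doculects)

longname2glottocode = Dict{String, String}(zip(longnames, glottocodes))
glottocode2longname = Dict{String, String}(zip(glottocodes, longnames))

# Duplicate keys are overwritten, so the later doculect wins.
println(glottocode2longname["zulu1248"])

# Made-up classification, only to show the hyphen replacement.
println(make_longname("Fam.SOME-GENUS", "NAME"))
```

Because the reverse mapping is not one-to-one, it cannot be used on its own to pick a doculect for a glottocode; that choice has to be made explicitly.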
Next, I restrict the character matrices to the doculects for which I have a glottocode.
filter!(row -> row.longname ∈ asjp_languages.longname, world_cc)
filter!(row -> row.longname ∈ asjp_languages.longname, world_sc)
Row | longname | x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | x9 | x10 | x11 | x12 | x13 | x14 | x15 | x16 | x17 | x18 | x19 | x20 | x21 | x22 | x23 | x24 | x25 | x26 | x27 | x28 | x29 | x30 | x31 | x32 | x33 | x34 | x35 | x36 | x37 | x38 | x39 | x40 | x41 | x42 | x43 | x44 | x45 | x46 | x47 | x48 | x49 | x50 | x51 | x52 | x53 | x54 | x55 | x56 | x57 | x58 | x59 | x60 | x61 | x62 | x63 | x64 | x65 | x66 | x67 | x68 | x69 | x70 | x71 | x72 | x73 | x74 | x75 | x76 | x77 | x78 | x79 | x80 | x81 | x82 | x83 | x84 | x85 | x86 | x87 | x88 | x89 | x90 | x91 | x92 | x93 | x94 | x95 | x96 | x97 | x98 | x99 | ⋯ |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | ⋯ | |
1 | AA.DIZOID.NAO | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
2 | AuA.KHASIAN.KHASI | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
3 | AuA.KHASIAN.KHASI_2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | ⋯ |
4 | AuA.KHASIAN.LYNGNGAM | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | ⋯ |
5 | AuA.KHASIAN.PNAR_JOWAI | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | ⋯ |
6 | AuA.KHASIAN.WAR_JAINTIA | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | ⋯ |
7 | Gun.GUNWINYGIC.BUAN | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
8 | Hok.YUMAN.YAVAPAI | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | ⋯ |
9 | Iwa.IWAIDJAN.AMURDAK | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | ⋯ |
10 | Iwa.IWAIDJAN.IWAIDJA | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
11 | LSR.GRASS.ABU | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | ⋯ |
12 | NC.KWA.AJAGBE | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
13 | NC.NORTHERN_ATLANTIC.WOLOF_8 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋱ |
7023 | NC.BANTOID.NYANJA_NYASA | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | ⋯ |
7024 | NC.KAINJI.KUKI | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
7025 | NC.KAINJI.REGI | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
7026 | NC.KAINJI.ROGO | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
7027 | NC.KAINJI.SHAMA | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
7028 | NC.KWA.AKPAFU | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
7029 | NC.PLATOID.BEROM_F | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
7030 | AA.BERBER.CHAOUI | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | ⋯ |
7031 | Man.WESTERN_MANDE.SEEKU | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
7032 | An.CELEBIC.TOLAKI | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
7033 | AA.WEST_CHADIC.DERA | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
7034 | NC.KAINJI.SEGEMUK | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
ASJP sometimes contains several doculects for the same glottocode. I therefore count the missing entries for each doculect and, for each glottocode, select the doculect with the fewest missing entries as its representative.
insertcols!(
    world_cc,
    1,
    :Glottocode => [longname2glottocode[x] for x in world_cc.longname]
)
insertcols!(
    world_sc,
    1,
    :Glottocode => [longname2glottocode[x] for x in world_sc.longname]
)
best_languages = @pipe world_sc |>
    DataFrame(
        longname = _.longname,
        Glottocode = _.Glottocode,
        nGaps = map(x -> sum(Array(x) .== "-"), eachrow(_))
    ) |>
    sort(_, :nGaps) |>
    unique(_, :Glottocode).longname
4261-element Vector{SubString{String}}:
"AuA.KHASIAN.KHASI"
"Hok.YUMAN.YAVAPAI"
"Iwa.IWAIDJAN.IWAIDJA"
"ST.BODIC.BUNAN"
"ST.BODIC.EASTERN_BALTI"
"ST.BODIC.GHACHOK"
"ST.BODIC.HELAMBU_SHERPA"
"ST.BODIC.KAGATE"
"ST.BODIC.LHASA_TIBETAN"
"ST.BODIC.LOWA"
"ST.BODIC.MANANGE"
"ST.BODIC.PATTANI"
"ST.BODIC.PURIK"
⋮
"Hok.YUMAN.MARICOPA"
"NC.BANTOID.FANG"
"NC.BANTOID.NJEN"
"TNG.BINANDEREAN.GAINA"
"NDe.ATHAPASKAN.HAN"
"TNG.BINANDEREAN.OROKAIVA_SOSE"
"ESu.NILOTIC.SOGOO"
"NC.BANTOID.KOSHIN"
"CSu.BONGO_BAGIRMI.GULA_SARA"
"An.GREATER_CENTRAL_PHILIPPINE.MANDAYAN_ISLAM_PISO"
"An.OCEANIC.PENRHYN"
"AA.BIU_MANDARA.VEMGO_MABAS_2"
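The selection rule (sort by number of gaps, then keep the first doculect seen for each glottocode) can be checked on a toy example. The sketch below uses plain Julia vectors instead of DataFrames; the doculect names, glottocodes, and the helper `pick_representatives` are made up for illustration and are not part of the original script.

```julia
# Toy data: each doculect has a glottocode and a character vector,
# where "-" marks a missing entry.
rows = [
    (longname = "A.X.ONE",   glottocode = "aaaa1111", chars = ["0", "-", "-", "1"]),
    (longname = "A.X.ONE_2", glottocode = "aaaa1111", chars = ["0", "1", "-", "1"]),
    (longname = "B.Y.TWO",   glottocode = "bbbb2222", chars = ["-", "-", "0", "0"]),
]

# Number of missing entries of one doculect.
n_gaps(row) = count(==("-"), row.chars)

# Sort by gap count, then keep the first doculect encountered per glottocode.
function pick_representatives(rows)
    seen = Set{String}()
    best = String[]
    for row in sort(rows; by = n_gaps)
        if row.glottocode ∉ seen
            push!(seen, row.glottocode)
            push!(best, row.longname)
        end
    end
    return best
end

best = pick_representatives(rows)
println(best)
```

Here `A.X.ONE_2` is chosen over `A.X.ONE` because it has one gap instead of two, and `B.Y.TWO` is kept as the only doculect for its glottocode.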
The character matrices are now restricted to the doculects representing a glottocode.
filter!(row -> row.longname ∈ best_languages, world_cc)
filter!(row -> row.longname ∈ best_languages, world_sc)
select!(world_cc, Not(:longname))
select!(world_sc, Not(:longname))
Row | Glottocode | x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | x9 | x10 | x11 | x12 | x13 | x14 | x15 | x16 | x17 | x18 | x19 | x20 | x21 | x22 | x23 | x24 | x25 | x26 | x27 | x28 | x29 | x30 | x31 | x32 | x33 | x34 | x35 | x36 | x37 | x38 | x39 | x40 | x41 | x42 | x43 | x44 | x45 | x46 | x47 | x48 | x49 | x50 | x51 | x52 | x53 | x54 | x55 | x56 | x57 | x58 | x59 | x60 | x61 | x62 | x63 | x64 | x65 | x66 | x67 | x68 | x69 | x70 | x71 | x72 | x73 | x74 | x75 | x76 | x77 | x78 | x79 | x80 | x81 | x82 | x83 | x84 | x85 | x86 | x87 | x88 | x89 | x90 | x91 | x92 | x93 | x94 | x95 | x96 | x97 | x98 | x99 | ⋯ |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
String | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | ⋯ | |
1 | nayi1243 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
2 | khas1269 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
3 | lyng1241 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | ⋯ |
4 | pnar1238 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | ⋯ |
5 | warj1242 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | ⋯ |
6 | ngal1292 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
7 | hava1248 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | ⋯ |
8 | amar1271 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | ⋯ |
9 | iwai1244 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
10 | abuu1241 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | ⋯ |
11 | ajab1235 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
12 | nucl1347 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
13 | amah1246 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | ⋯ |
⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋱ |
4250 | rapa1244 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | ⋯ |
4251 | toro1253 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | ⋯ |
4252 | amas1236 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | ⋯ |
4253 | lagw1237 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | ⋯ |
4254 | vemg1240 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | ⋯ |
4255 | rogo1238 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
4256 | sham1278 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
4257 | siwu1238 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
4258 | tach1249 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | ⋯ |
4259 | seek1238 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
4260 | dera1248 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
4261 | east2403 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
For the glottocodes in the target set for which there are no ASJP data, a character vector consisting of missing entries is constructed. Then, the character matrices are restricted to the glottocodes from the target set.
for l in setdiff(d.glottocode, world_sc.Glottocode)
    nl_cc = repeat(["-"], size(world_cc, 2))
    nl_cc[1] = l
    push!(world_cc, nl_cc)
    nl_sc = repeat(["-"], size(world_sc, 2))
    nl_sc[1] = l
    push!(world_sc, nl_sc)
end
filter!(row -> row.Glottocode ∈ d.glottocode, world_cc)
filter!(row -> row.Glottocode ∈ d.glottocode, world_sc)
Row | Glottocode | x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | x9 | x10 | x11 | x12 | x13 | x14 | x15 | x16 | x17 | x18 | x19 | x20 | x21 | x22 | x23 | x24 | x25 | x26 | x27 | x28 | x29 | x30 | x31 | x32 | x33 | x34 | x35 | x36 | x37 | x38 | x39 | x40 | x41 | x42 | x43 | x44 | x45 | x46 | x47 | x48 | x49 | x50 | x51 | x52 | x53 | x54 | x55 | x56 | x57 | x58 | x59 | x60 | x61 | x62 | x63 | x64 | x65 | x66 | x67 | x68 | x69 | x70 | x71 | x72 | x73 | x74 | x75 | x76 | x77 | x78 | x79 | x80 | x81 | x82 | x83 | x84 | x85 | x86 | x87 | x88 | x89 | x90 | x91 | x92 | x93 | x94 | x95 | x96 | x97 | x98 | x99 | ⋯ |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
String | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | ⋯ | |
1 | khas1269 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
2 | nucl1347 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
3 | amah1246 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | ⋯ |
4 | sher1255 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
5 | akha1245 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | ⋯ |
6 | sich1238 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
7 | nucl1310 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
8 | kach1280 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | ⋯ |
9 | karb1241 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
10 | loth1237 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
11 | lepc1244 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | ⋯ |
12 | west2418 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
13 | bori1243 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | ⋯ |
⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋱ |
1199 | piar1243 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
1200 | chib1270 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
1201 | yine1238 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
1202 | uruu1244 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
1203 | onaa1245 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
1204 | tehu1242 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
1205 | abip1241 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
1206 | trum1247 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
1207 | umot1240 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
1208 | awet1244 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
1209 | tupi1273 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
1210 | kaya1330 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
In the cognate class character matrix, all columns containing no `1` are removed.
Row | Glottocode | x1 | x2 | x4 | x5 | x9 | x10 | x11 | x13 | x14 | x20 | x22 | x24 | x29 | x32 | x35 | x44 | x45 | x47 | x48 | x49 | x52 | x54 | x55 | x56 | x57 | x59 | x60 | x63 | x64 | x65 | x66 | x68 | x69 | x71 | x79 | x80 | x82 | x84 | x85 | x90 | x97 | x101 | x105 | x107 | x108 | x109 | x110 | x113 | x118 | x121 | x124 | x127 | x129 | x134 | x137 | x140 | x143 | x144 | x147 | x148 | x149 | x150 | x153 | x154 | x156 | x157 | x158 | x159 | x161 | x163 | x165 | x166 | x167 | x168 | x170 | x174 | x175 | x178 | x179 | x181 | x182 | x183 | x187 | x190 | x195 | x196 | x197 | x201 | x202 | x203 | x205 | x208 | x212 | x217 | x218 | x220 | x221 | x222 | x223 | ⋯ |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
String | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | ⋯ | |
1 | khas1269 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
2 | nucl1347 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
3 | amah1246 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
4 | sher1255 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
5 | akha1245 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
6 | sich1238 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
7 | nucl1310 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
8 | kach1280 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
9 | karb1241 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
10 | loth1237 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
11 | lepc1244 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
12 | west2418 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
13 | bori1243 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋱ |
1199 | piar1243 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
1200 | chib1270 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
1201 | yine1238 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
1202 | uruu1244 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
1203 | onaa1245 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
1204 | tehu1242 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
1205 | abip1241 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
1206 | trum1247 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
1207 | umot1240 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
1208 | awet1244 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
1209 | tupi1273 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
1210 | kaya1330 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
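The code for the column filtering is not shown above. A minimal sketch of the idea, assuming a plain `Matrix{String}` instead of the `DataFrame` used in this script (the names `chars`, `colnames`, `keep`, `filtered` and `kept_names` are made up for illustration):

```julia
# Toy character matrix: rows are languages, columns are cognate classes.
# "-" marks a missing entry, as in the tables above.
chars = ["0" "1" "0" "-";
         "0" "0" "0" "-";
         "-" "1" "0" "-"]
colnames = ["x1", "x2", "x3", "x4"]

# A column is kept only if at least one language has a "1" in it.
keep = [any(==("1"), col) for col in eachcol(chars)]

filtered = chars[:, keep]
kept_names = colnames[keep]
```

Columns that contain only `0` and `-` carry no information about shared cognate classes, so dropping them shrinks the matrix without losing signal. With a `DataFrame`, the same predicate can be applied to each column other than `Glottocode`.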
Now I add the Glottolog family names to `d`.
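How the family names are looked up is not shown here. The following is only a sketch under assumptions: `family_of` is a hypothetical lookup table (in practice it would presumably be built from Glottolog's languoid data), and `sanitize` is a made-up helper that reflects my guess that the underscores in names such as `Khoe_Kwadi` replace the hyphens of the Glottolog names.

```julia
# Hypothetical lookup table from glottocode to Glottolog family name.
# The three entries correspond to the first three rows of `d`.
family_of = Dict(
    "juho1239" => "Kxa",
    "okie1245" => "Nilotic",
    "nama1265" => "Khoe-Kwadi",
)

# Replace every character that is not an ASCII letter or digit by "_",
# so that the name can safely be used as part of a file name.
sanitize(name) = replace(name, r"[^A-Za-z0-9]" => "_")

glottocodes = ["juho1239", "okie1245", "nama1265"]
families = [sanitize(family_of[g]) for g in glottocodes]
```

The vector printed next is the actual result for the full data set.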
1291-element Vector{String}:
"Kxa"
"Nilotic"
"Khoe_Kwadi"
"Khoe_Kwadi"
"Atlantic_Congo"
"Sandawe"
"Khoe_Kwadi"
"Tuu"
"Hadza"
"Atlantic_Congo"
"Atlantic_Congo"
"Atlantic_Congo"
"Atlantic_Congo"
⋮
"Tupian"
"Nuclear_Macro_Je"
"Tupian"
"Nuclear_Macro_Je"
"Nuclear_Macro_Je"
"Nuclear_Macro_Je"
"Nuclear_Macro_Je"
"Nuclear_Macro_Je"
"Tupian"
"Nuclear_Macro_Je"
"Tupian"
"Nuclear_Macro_Je"
Here is a helper function that takes a `DataFrame` object representing a character matrix and constructs the content of a Nexus file representing that matrix.
If a family contains only two taxa, a dummy taxon is added which has all characters missing. This is required because MrBayes only works with datasets containing at least 3 taxa.
# build the content of a Nexus file from a character matrix
function df2nexus(cm)
    # width of the taxon-name column
    pad = maximum(length.(cm.Glottocode)) + 5
    # a two-taxon family gets one extra dummy taxon
    ntaxa = size(cm, 1) == 2 ? 3 : size(cm, 1)
    nex = """#Nexus
    BEGIN DATA;
    DIMENSIONS ntax=$ntaxa nchar = $(size(cm, 2)-1);
    FORMAT DATATYPE=Restriction GAP=? MISSING=- interleave=no;
    MATRIX
    """
    # one line per taxon: padded glottocode followed by the character string
    for i in axes(cm, 1)
        nex *= rpad(cm.Glottocode[i], pad) * join(Array(cm[i, 2:end])) * "\n"
    end
    if nrow(cm) == 2
        nex *= rpad("dummy", pad) * repeat("?", size(cm, 2)-1) * "\n"
    end
    nex *= ";\nEND"
    nex
end
df2nexus (generic function with 1 method)
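To catch mismatches between the declared dimensions and the matrix before handing a file to MrBayes, a small consistency check on the generated text can help. The sketch below is not part of the original workflow: `nexus_dims_ok` is a hypothetical helper, and `nex` is a hand-written string in the shape that `df2nexus` should produce for a family with two languages (with made-up glottocodes) and three characters.

```julia
# Hand-written example text; "aaaa1111" and "bbbb2222" are made-up glottocodes.
nex = """#Nexus
BEGIN DATA;
DIMENSIONS ntax=3 nchar = 3;
FORMAT DATATYPE=Restriction GAP=? MISSING=- interleave=no;
MATRIX
aaaa1111     010
bbbb2222     1-0
dummy        ???
;
END"""

# Return true if the number of matrix rows and the length of each character
# string agree with the values declared in the DIMENSIONS line.
function nexus_dims_ok(nex::AbstractString)
    m = match(r"ntax\s*=\s*(\d+)\s+nchar\s*=\s*(\d+)", nex)
    m === nothing && return false
    ntax, nchar = parse(Int, m[1]), parse(Int, m[2])

    lines = split(nex, '\n')
    i_start = findfirst(l -> strip(l) == "MATRIX", lines)
    i_start === nothing && return false
    i_end = findnext(l -> strip(l) == ";", lines, i_start + 1)
    i_end === nothing && return false

    # Each matrix line splits into a taxon name and a character string.
    rows = [split(l) for l in lines[i_start+1:i_end-1]]
    return length(rows) == ntax &&
           all(r -> length(r) == 2 && length(r[2]) == nchar, rows)
end

nexus_dims_ok(nex)
```

The check relies on taxon names containing no whitespace, which holds for glottocodes.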
Next, the two character matrices are concatenated:
Row | Glottocode | x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | x9 | x10 | x11 | x12 | x13 | x14 | x15 | x16 | x17 | x18 | x19 | x20 | x21 | x22 | x23 | x24 | x25 | x26 | x27 | x28 | x29 | x30 | x31 | x32 | x33 | x34 | x35 | x36 | x37 | x38 | x39 | x40 | x41 | x42 | x43 | x44 | x45 | x46 | x47 | x48 | x49 | x50 | x51 | x52 | x53 | x54 | x55 | x56 | x57 | x58 | x59 | x60 | x61 | x62 | x63 | x64 | x65 | x66 | x67 | x68 | x69 | x70 | x71 | x72 | x73 | x74 | x75 | x76 | x77 | x78 | x79 | x80 | x81 | x82 | x83 | x84 | x85 | x86 | x87 | x88 | x89 | x90 | x91 | x92 | x93 | x94 | x95 | x96 | x97 | x98 | x99 | ⋯ |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
String | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | SubStrin… | ⋯ | |
1 | khas1269 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
2 | nucl1347 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
3 | amah1246 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | ⋯ |
4 | sher1255 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
5 | akha1245 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | ⋯ |
6 | sich1238 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
7 | nucl1310 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
8 | kach1280 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | ⋯ |
9 | karb1241 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
10 | loth1237 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
11 | lepc1244 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | ⋯ |
12 | west2418 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ⋯ |
13 | bori1243 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | ⋯ |
⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋱ |
1199 | piar1243 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
1200 | chib1270 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
1201 | yine1238 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
1202 | uruu1244 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
1203 | onaa1245 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
1204 | tehu1242 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
1205 | abip1241 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
1206 | trum1247 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
1207 | umot1240 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
1208 | awet1244 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
1209 | tupi1273 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
1210 | kaya1330 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | ⋯ |
lineages = @pipe d |>
    unique(_, :glottocode) |>
    groupby(_, :Family) |>
    combine(_, nrow) |>
    sort(_, :nrow)
families = lineages.Family[lineages.nrow .> 1]
73-element Vector{String}:
"Kadugli_Krongo"
"Songhay"
"Basque"
"Abkhaz_Adyge"
"Hmong_Mien"
"Tai_Kadai"
"Ndu"
"Koiarian"
"Greater_Kwerba"
"Chinookan"
"Palaihnihan"
"Maiduan"
"Yuki_Wappo"
⋮
"Uralic"
"Salishan"
"Mande"
"Athabaskan_Eyak_Tlingit"
"Uto_Aztecan"
"Sino_Tibetan"
"Algic"
"Nilotic"
"Indo_European"
"Afro_Asiatic"
"Austronesian"
"Atlantic_Congo"
Now I fetch the Glottolog classification, stored as a file of Newick strings, from the Glottolog website. The download is skipped if a local copy already exists.
glottologF = "../data/tree_glottolog_newick.txt"
isfile(glottologF) || download(
    "https://cdstar.eva.mpg.de//bitstreams/EAEA0-B701-6328-C3E3-0/tree_glottolog_newick.txt",
    glottologF
)
true
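The download-if-missing step can also be wrapped in a small helper. The following is only a sketch: `fetch_once` is a hypothetical name that does not appear in the original script, and it uses `Downloads.download` from the standard library, which the Julia documentation recommends over `Base.download` since Julia 1.6.

```julia
using Downloads

# Hypothetical helper (not part of the original script):
# download `url` to `path` only if `path` does not exist yet.
function fetch_once(url::AbstractString, path::AbstractString)
    if !isfile(path)
        dir = dirname(path)
        # Create the target directory if the path has one.
        isempty(dir) || mkpath(dir)
        Downloads.download(url, path)
    end
    return path
end

# Demo without network access: the file already exists,
# so the URL is never contacted and the content stays untouched.
cached = tempname()
write(cached, "cached copy")
fetch_once("https://invalid.example/none.txt", cached)
println(read(cached, String))
```

In the script, the call would presumably take the URL shown above and `glottologF` as its two arguments.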
First, some clean-up is needed to make the Newick strings digestible by ete3. Then the Newick tree for each family is read in as an ete3 tree object.
Next, the Glottolog trees for the individual families are combined into a single Glottolog world tree.
This Glottolog tree contains internal nodes representing glottocodes. They may have daughter nodes representing dialects. To make sure that each glottocode is a leaf, I create another leaf daughter for each named internal node and shift the name of the internal node to that new leaf.
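# Attach every family tree under a common, unnamed root.
glot = ete3.Tree()
for t in trees
    glot.add_child(t)
end

# Names of all internal nodes that carry a name (glottocode).
nonLeaves = [nd.name for nd in glot.traverse()
    if nd.name != "" && !nd.is_leaf()
]

# Move each such name to a newly created leaf daughter.
@showprogress for nm in nonLeaves
    nd = (glot & nm)
    nd.name = ""
    nd.add_child(name=nm)
end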
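To illustrate what the loop above does, here is a minimal sketch on a toy tree in plain Julia. The `Node` type and the functions `nodes`, `promote_named_internal!` and `leafnames` are hypothetical stand-ins invented for this example; they are not the ete3 API and not the code used in this script.

```julia
# Toy tree node; a stand-in for an ete3 node, not the ete3 API.
mutable struct Node
    name::String
    children::Vector{Node}
end
Node(name::AbstractString) = Node(name, Node[])

isleaf(nd::Node) = isempty(nd.children)

# Collect all nodes of the tree in preorder.
function nodes(nd::Node, acc::Vector{Node}=Node[])
    push!(acc, nd)
    for c in nd.children
        nodes(c, acc)
    end
    return acc
end

# Move the name of every named internal node to a new leaf daughter.
function promote_named_internal!(root::Node)
    # Snapshot the targets first so that newly added leaves are not revisited.
    targets = [nd for nd in nodes(root) if nd.name != "" && !isleaf(nd)]
    for nd in targets
        push!(nd.children, Node(nd.name))
        nd.name = ""
    end
    return root
end

leafnames(root::Node) = [nd.name for nd in nodes(root) if isleaf(nd)]

# "lang1" is an internal node with two dialect daughters.
root = Node("", [Node("lang1", [Node("dial1"), Node("dial2")]), Node("lang2")])
promote_named_internal!(root)
println(leafnames(root))  # ["dial1", "dial2", "lang1", "lang2"]
```

The sketch keeps references to the nodes themselves rather than looking each one up by name again. Whether that would noticeably speed up the ete3 version is an assumption I have not tested.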
0:00:08Progress: 87%|███████████████████████████████████▌ | ETA: 0:00:08Progress: 87%|███████████████████████████████████▋ | ETA: 0:00:08Progress: 87%|███████████████████████████████████▋ | ETA: 0:00:08Progress: 87%|███████████████████████████████████▋ | ETA: 0:00:08Progress: 87%|███████████████████████████████████▊ | ETA: 0:00:08Progress: 87%|███████████████████████████████████▊ | ETA: 0:00:08Progress: 87%|███████████████████████████████████▊ | ETA: 0:00:08Progress: 87%|███████████████████████████████████▉ | ETA: 0:00:08Progress: 88%|███████████████████████████████████▉ | ETA: 0:00:08Progress: 88%|███████████████████████████████████▉ | ETA: 0:00:07Progress: 88%|████████████████████████████████████ | ETA: 0:00:07Progress: 88%|████████████████████████████████████ | ETA: 0:00:07Progress: 88%|████████████████████████████████████ | ETA: 0:00:07Progress: 88%|████████████████████████████████████▏ | ETA: 0:00:07Progress: 88%|████████████████████████████████████▏ | ETA: 0:00:07Progress: 88%|████████████████████████████████████▎ | ETA: 0:00:07Progress: 88%|████████████████████████████████████▎ | ETA: 0:00:07Progress: 88%|████████████████████████████████████▎ | ETA: 0:00:07Progress: 89%|████████████████████████████████████▍ | ETA: 0:00:07Progress: 89%|████████████████████████████████████▍ | ETA: 0:00:07Progress: 89%|████████████████████████████████████▍ | ETA: 0:00:07Progress: 89%|████████████████████████████████████▌ | ETA: 0:00:07Progress: 89%|████████████████████████████████████▌ | ETA: 0:00:07Progress: 89%|████████████████████████████████████▌ | ETA: 0:00:07Progress: 89%|████████████████████████████████████▋ | ETA: 0:00:07Progress: 89%|████████████████████████████████████▋ | ETA: 0:00:07Progress: 89%|████████████████████████████████████▋ | ETA: 0:00:06Progress: 90%|████████████████████████████████████▊ | ETA: 0:00:06Progress: 90%|████████████████████████████████████▊ | ETA: 0:00:06Progress: 90%|████████████████████████████████████▊ | ETA: 0:00:06Progress: 
90%|████████████████████████████████████▉ | ETA: 0:00:06Progress: 90%|████████████████████████████████████▉ | ETA: 0:00:06Progress: 90%|████████████████████████████████████▉ | ETA: 0:00:06Progress: 90%|█████████████████████████████████████ | ETA: 0:00:06Progress: 90%|█████████████████████████████████████ | ETA: 0:00:06Progress: 90%|█████████████████████████████████████ | ETA: 0:00:06Progress: 90%|█████████████████████████████████████▏ | ETA: 0:00:06Progress: 91%|█████████████████████████████████████▏ | ETA: 0:00:06Progress: 91%|█████████████████████████████████████▏ | ETA: 0:00:06Progress: 91%|█████████████████████████████████████▎ | ETA: 0:00:06Progress: 91%|█████████████████████████████████████▎ | ETA: 0:00:06Progress: 91%|█████████████████████████████████████▎ | ETA: 0:00:06Progress: 91%|█████████████████████████████████████▍ | ETA: 0:00:06Progress: 91%|█████████████████████████████████████▍ | ETA: 0:00:05Progress: 91%|█████████████████████████████████████▍ | ETA: 0:00:05Progress: 91%|█████████████████████████████████████▌ | ETA: 0:00:05Progress: 92%|█████████████████████████████████████▌ | ETA: 0:00:05Progress: 92%|█████████████████████████████████████▋ | ETA: 0:00:05Progress: 92%|█████████████████████████████████████▋ | ETA: 0:00:05Progress: 92%|█████████████████████████████████████▋ | ETA: 0:00:05Progress: 92%|█████████████████████████████████████▋ | ETA: 0:00:05Progress: 92%|█████████████████████████████████████▊ | ETA: 0:00:05Progress: 92%|█████████████████████████████████████▊ | ETA: 0:00:05Progress: 92%|█████████████████████████████████████▊ | ETA: 0:00:05Progress: 92%|█████████████████████████████████████▉ | ETA: 0:00:05Progress: 92%|█████████████████████████████████████▉ | ETA: 0:00:05Progress: 93%|█████████████████████████████████████▉ | ETA: 0:00:05Progress: 93%|██████████████████████████████████████ | ETA: 0:00:05Progress: 93%|██████████████████████████████████████ | ETA: 0:00:05Progress: 93%|██████████████████████████████████████ | ETA: 
0:00:05Progress: 93%|██████████████████████████████████████▏ | ETA: 0:00:04Progress: 93%|██████████████████████████████████████▏ | ETA: 0:00:04Progress: 93%|██████████████████████████████████████▏ | ETA: 0:00:04Progress: 93%|██████████████████████████████████████▎ | ETA: 0:00:04Progress: 93%|██████████████████████████████████████▎ | ETA: 0:00:04Progress: 93%|██████████████████████████████████████▎ | ETA: 0:00:04Progress: 93%|██████████████████████████████████████▍ | ETA: 0:00:04Progress: 94%|██████████████████████████████████████▍ | ETA: 0:00:04Progress: 94%|██████████████████████████████████████▍ | ETA: 0:00:04Progress: 94%|██████████████████████████████████████▍ | ETA: 0:00:04Progress: 94%|██████████████████████████████████████▌ | ETA: 0:00:04Progress: 94%|██████████████████████████████████████▌ | ETA: 0:00:04Progress: 94%|██████████████████████████████████████▌ | ETA: 0:00:04Progress: 94%|██████████████████████████████████████▋ | ETA: 0:00:04Progress: 94%|██████████████████████████████████████▋ | ETA: 0:00:04Progress: 94%|██████████████████████████████████████▋ | ETA: 0:00:04Progress: 94%|██████████████████████████████████████▊ | ETA: 0:00:04Progress: 94%|██████████████████████████████████████▊ | ETA: 0:00:04Progress: 95%|██████████████████████████████████████▊ | ETA: 0:00:03Progress: 95%|██████████████████████████████████████▉ | ETA: 0:00:03Progress: 95%|██████████████████████████████████████▉ | ETA: 0:00:03Progress: 95%|██████████████████████████████████████▉ | ETA: 0:00:03Progress: 95%|██████████████████████████████████████▉ | ETA: 0:00:03Progress: 95%|███████████████████████████████████████ | ETA: 0:00:03Progress: 95%|███████████████████████████████████████ | ETA: 0:00:03Progress: 95%|███████████████████████████████████████ | ETA: 0:00:03Progress: 95%|███████████████████████████████████████▏ | ETA: 0:00:03Progress: 95%|███████████████████████████████████████▏ | ETA: 0:00:03Progress: 96%|███████████████████████████████████████▏ | ETA: 0:00:03Progress: 
96%|███████████████████████████████████████▎ | ETA: 0:00:03Progress: 96%|███████████████████████████████████████▎ | ETA: 0:00:03Progress: 96%|███████████████████████████████████████▎ | ETA: 0:00:03Progress: 96%|███████████████████████████████████████▍ | ETA: 0:00:03Progress: 96%|███████████████████████████████████████▍ | ETA: 0:00:03Progress: 96%|███████████████████████████████████████▍ | ETA: 0:00:03Progress: 96%|███████████████████████████████████████▍ | ETA: 0:00:02Progress: 96%|███████████████████████████████████████▌ | ETA: 0:00:02Progress: 96%|███████████████████████████████████████▌ | ETA: 0:00:02Progress: 96%|███████████████████████████████████████▌ | ETA: 0:00:02Progress: 97%|███████████████████████████████████████▋ | ETA: 0:00:02Progress: 97%|███████████████████████████████████████▋ | ETA: 0:00:02Progress: 97%|███████████████████████████████████████▋ | ETA: 0:00:02Progress: 97%|███████████████████████████████████████▊ | ETA: 0:00:02Progress: 97%|███████████████████████████████████████▊ | ETA: 0:00:02Progress: 97%|███████████████████████████████████████▊ | ETA: 0:00:02Progress: 97%|███████████████████████████████████████▉ | ETA: 0:00:02Progress: 97%|███████████████████████████████████████▉ | ETA: 0:00:02Progress: 97%|███████████████████████████████████████▉ | ETA: 0:00:02Progress: 97%|███████████████████████████████████████▉ | ETA: 0:00:02Progress: 97%|████████████████████████████████████████ | ETA: 0:00:02Progress: 98%|████████████████████████████████████████ | ETA: 0:00:02Progress: 98%|████████████████████████████████████████ | ETA: 0:00:02Progress: 98%|████████████████████████████████████████▏| ETA: 0:00:01Progress: 98%|████████████████████████████████████████▏| ETA: 0:00:01Progress: 98%|████████████████████████████████████████▏| ETA: 0:00:01Progress: 98%|████████████████████████████████████████▏| ETA: 0:00:01Progress: 98%|████████████████████████████████████████▎| ETA: 0:00:01Progress: 98%|████████████████████████████████████████▎| ETA: 
0:00:01Progress: 98%|████████████████████████████████████████▎| ETA: 0:00:01Progress: 98%|████████████████████████████████████████▍| ETA: 0:00:01Progress: 98%|████████████████████████████████████████▍| ETA: 0:00:01Progress: 99%|████████████████████████████████████████▍| ETA: 0:00:01Progress: 99%|████████████████████████████████████████▌| ETA: 0:00:01Progress: 99%|████████████████████████████████████████▌| ETA: 0:00:01Progress: 99%|████████████████████████████████████████▌| ETA: 0:00:01Progress: 99%|████████████████████████████████████████▋| ETA: 0:00:01Progress: 99%|████████████████████████████████████████▋| ETA: 0:00:01Progress: 99%|████████████████████████████████████████▋| ETA: 0:00:01Progress: 99%|████████████████████████████████████████▋| ETA: 0:00:01Progress: 99%|████████████████████████████████████████▊| ETA: 0:00:00Progress: 99%|████████████████████████████████████████▊| ETA: 0:00:00Progress: 100%|████████████████████████████████████████▊| ETA: 0:00:00Progress: 100%|████████████████████████████████████████▉| ETA: 0:00:00Progress: 100%|████████████████████████████████████████▉| ETA: 0:00:00Progress: 100%|████████████████████████████████████████▉| ETA: 0:00:00Progress: 100%|█████████████████████████████████████████| ETA: 0:00:00Progress: 100%|█████████████████████████████████████████| ETA: 0:00:00Progress: 100%|█████████████████████████████████████████| Time: 0:01:06
Next I create a dictionary with the families from the target set as keys. For each target family, I prune the Glottolog world tree to the glottocodes from that family and store it as a newick string in the dictionary.
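function family_to_tree(fm; d=d, glot=glot)
    # glottocodes of the target languages that belong to family `fm`
    fm_taxa = d.glottocode[d.Family.==fm]
    # prune a copy, so that the world tree itself stays intact
    glot_fm = glot.copy()
    glot_fm.prune(fm_taxa)
    # format=9 writes a Newick string with leaf names only
    glot_fm.write(format=9)
end
glot_tree_dict = Dict()
for fm in families
    @info fm
    glot_tree_dict[fm] = family_to_tree(fm)
end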
[ Info: Kadugli_Krongo
[ Info: Songhay
[ Info: Basque
[ Info: Abkhaz_Adyge
[ Info: Hmong_Mien
[ Info: Tai_Kadai
[ Info: Ndu
[ Info: Koiarian
[ Info: Greater_Kwerba
[ Info: Chinookan
[ Info: Palaihnihan
[ Info: Maiduan
[ Info: Yuki_Wappo
[ Info: Miwok_Costanoan
[ Info: Mixe_Zoque
[ Info: Tucanoan
[ Info: Chonan
[ Info: Matacoan
[ Info: Bororoan
[ Info: Khoe_Kwadi
[ Info: Nubian
[ Info: Surmic
[ Info: South_Omotic
[ Info: Kartvelian
[ Info: Chukotko_Kamchatkan
[ Info: Wintuan
[ Info: Pomoan
[ Info: Yokutsan
[ Info: Iroquoian
[ Info: Pano_Tacanan
[ Info: Guaicuruan
[ Info: Saharan
[ Info: Japonic
[ Info: Sahaptian
[ Info: Caddoan
[ Info: Muskogean
[ Info: Yanomamic
[ Info: Kru
[ Info: Heibanic
[ Info: Wakashan
[ Info: Keresan
[ Info: Otomanguean
[ Info: Chibchan
[ Info: Ta_Ne_Omotic
[ Info: Mongolic_Khitan
[ Info: Tungusic
[ Info: Pama_Nyungan
[ Info: Kiowa_Tanoan
[ Info: Nuclear_Macro_Je
[ Info: Cochimi_Yuman
[ Info: Mayan
[ Info: Nuclear_Trans_New_Guinea
[ Info: Tupian
[ Info: Dravidian
[ Info: Siouan
[ Info: Turkic
[ Info: Cariban
[ Info: Central_Sudanic
[ Info: Arawakan
[ Info: Eskimo_Aleut
[ Info: Austroasiatic
[ Info: Uralic
[ Info: Salishan
[ Info: Mande
[ Info: Athabaskan_Eyak_Tlingit
[ Info: Uto_Aztecan
[ Info: Sino_Tibetan
[ Info: Algic
[ Info: Nilotic
[ Info: Indo_European
[ Info: Afro_Asiatic
[ Info: Austronesian
[ Info: Atlantic_Congo
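As a quick sanity check on the pruned trees, the leaf labels of a topology-only Newick string can be recovered without ete3. The sketch below is illustrative and not part of the original workflow: `newick_leaves` is a hypothetical helper, the grouping of the three glottocodes is made up, and the code assumes Newick strings without branch lengths or internal node labels, which is what `format=9` writes.

```julia
# Hypothetical helper: extract the leaf labels from a topology-only
# Newick string such as "((a,b),c);".
# Assumes no branch lengths and no internal node labels.
function newick_leaves(nwk::AbstractString)
    [String(m.match) for m in eachmatch(r"[^(),;\s]+", nwk)]
end

# Toy tree; the grouping of these glottocodes is invented for illustration.
nwk = "((juho1239,okie1245),nama1265);"
leaves = newick_leaves(nwk)

# A pruned tree should contain exactly the requested taxa.
requested = ["juho1239", "okie1245", "nama1265"]
@assert Set(leaves) == Set(requested)
```

Under the same assumptions, comparing `Set(newick_leaves(glot_tree_dict[fm]))` with the set of glottocodes of family `fm` would confirm that no taxon was lost during pruning.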
Finally, the MrBayes files are created.
For each family, there are three nexus files: one containing the character matrix, and two MrBayes scripts.
Using two MrBayes scripts is a hack, motivated by two problems:
If I use the early stop rule from the outset, sampling for very small families will stop right away because there is no (or little) phylogenetic uncertainty. It is still advisable to do some sampling though, to get a good estimate for the branch lengths.
Not using the stop rule is not a good option either, because then I have to fix a sufficiently large number of iterations. For large families, this is in the tens of millions, which would be a waste of resources for smaller families.
As a compromise, I use the first script to run 10,000,000 iterations without the stop rule, and the second script to continue up to 1,000,000,000 iterations with the stop rule.
Both scripts are generated by the following function:
function create_mb_script(
    fn,
    char_mtx,
    clades,
    fm_cc,
    fm_glottocodes,
    n_iterations,
    n_runs,
    n_chains,
    append,
    stoprule
)
    # header: load the data, define the two partitions and the model
    mb = """#Nexus
    Begin MrBayes;
    execute $fn.nex;
    charset sc = 1-1640;
    charset cc = 1641-$(size(char_mtx, 2)-1);
    partition dtype = 2:sc, cc;
    set partition = dtype;
    prset applyto=(all) brlenspr = clock:uniform;
    prset applyto=(all) clockvarpr = igr;
    lset applyto=(all) rates=gamma;
    unlink Statefreq=(all) shape=(all) igrvar=(all) rate=(all);
    prset applyto=(all) ratepr=Dirichlet(1, 1);
    prset applyto=(2) clockratepr=exp(1.0); [for partition 2]
    lset applyto=(1) coding=all;
    lset applyto=(2) coding=noabsencesites;
    """
    # one topological constraint per clade of the Glottolog tree
    if length(clades) > 1
        for (i, cl) in enumerate(clades)
            cn = join(cl, " ")
            mb *= " constraint c$i = "
            mb *= "$cn;\n"
        end
        mb *= " prset topologypr = constraints("
        mb *= join(["c$i" for i in 1:length(clades)], ",") * ");\n"
    end
    # two-taxon families: constrain the pair as a single clade
    if length(fm_glottocodes) == 2
        mb *= "constraint c1 = $(join(fm_cc.Glottocode, " "));\n"
        mb *= "prset topologypr = constraints(c1);\n"
    end
    # large families: use double precision in BEAGLE
    if length(fm_glottocodes) > 100
        mb *= " set beagleprecision=double;\n"
    end
    # MCMC settings; `stoprule` and `append` distinguish the head and tail scripts
    mb *= """ prset brlenspr = clock:uniform;
    prset clockvarpr = igr;
    mcmcp stoprule=$stoprule stopval=0.05 filename=output/$fn samplefreq=1000;
    mcmc ngen=$n_iterations nchains=$n_chains nruns=$n_runs append=$append;
    sumt;
    sump;
    q;
    end;
    """
    return mb
end
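To make the constraint section concrete, the following stand-alone sketch reproduces only that part of the script for a toy set of clades. `constraint_block` is a hypothetical helper that is not part of the original code, and the taxon names are invented. Unlike `create_mb_script`, it emits the block regardless of how many clades there are.

```julia
# Hypothetical helper mirroring the constraint loop in `create_mb_script`:
# one `constraint` line per clade, followed by a single `prset topologypr`
# line that activates all of them.
function constraint_block(clades)
    out = ""
    for (i, cl) in enumerate(clades)
        out *= "constraint c$i = $(join(cl, " "));\n"
    end
    labels = join(["c$i" for i in eachindex(clades)], ",")
    out *= "prset topologypr = constraints($labels);\n"
    return out
end

# Invented taxa: two nested clades, as they might come from a pruned tree.
clades = [["a", "b"], ["a", "b", "c"]]
print(constraint_block(clades))
```

Each internal node of the pruned Glottolog tree thus becomes one hard constraint, so that the sampled trees stay compatible with the Glottolog classification.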
mkpath("mrbayes/output")
@showprogress for (i, fm) in enumerate(families)
    fm_glottocodes = d.glottocode[d.Family.==fm]
    # file name stem: zero-padded index plus family name
    fn = lpad(i, 3, "0")*"_"*fm
    # rows of world_cc for this family
    fm_cc = @pipe world_cc |>
        filter(row -> row.Glottocode ∈ fm_glottocodes, _)
    fm_characters = names(fm_cc)[2:end]
    # keep only characters with at least one "1" within the family
    informative = map(x -> sum(string.(fm_cc[:,x]) .== "1") .> 0, fm_characters)
    fm_cc = select(
        fm_cc,
        vcat(["Glottocode"], fm_characters[informative])
    )
    # rows of world_sc for this family; all characters are kept
    fm_sc = @pipe world_sc |>
        filter(row -> row.Glottocode ∈ fm_glottocodes, _)
    fm_characters = [x for x in names(fm_sc) if x != "Glottocode"]
    fm_sc = select(
        fm_sc,
        vcat(["Glottocode"], fm_characters)
    )
    # combined character matrix: sc characters first, then cc characters
    char_mtx = innerjoin(
        fm_sc,
        fm_cc,
        on=:Glottocode => :Glottocode,
        makeunique=true,
    )
    # clades of the pruned Glottolog tree serve as topological constraints
    fm_glot = ete3.Tree(glot_tree_dict[fm], format=1)
    internal_nodes = [
        nd for nd in fm_glot.traverse()
        if nd.is_leaf() == false && nd.is_root() == false
    ]
    clades = [x.get_leaf_names() for x in internal_nodes]
    n_iterations_head = 10_000_000
    n_iterations_tail = 1_000_000_000
    # more chains and runs for families with more than 100 languages
    n_chains = length(fm_glottocodes) > 100 ? 4 : 2
    n_runs = length(fm_glottocodes) > 100 ? 4 : 2
    # head script: fixed number of iterations, no stop rule
    mb_head = create_mb_script(
        fn,
        char_mtx,
        clades,
        fm_cc,
        fm_glottocodes,
        n_iterations_head,
        n_runs,
        n_chains,
        "no",
        "no"
    )
    # tail script: append to the head run and sample until the stop rule fires
    mb_tail = create_mb_script(
        fn,
        char_mtx,
        clades,
        fm_cc,
        fm_glottocodes,
        n_iterations_tail,
        n_runs,
        n_chains,
        "yes",
        "yes"
    )
    write("mrbayes/$(fn)_head.mb.nex", mb_head)
    write("mrbayes/$(fn)_tail.mb.nex", mb_tail)
    write("mrbayes/$fn.nex", df2nexus(char_mtx))
end
Progress: 100%|█████████████████████████████████████████| Time: 0:00:03
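The filter on the cc characters in the loop above keeps a column only if at least one language in the family has the value "1". Here is a minimal sketch of the same idea on a plain matrix of strings, without DataFrames. The data, the use of "-" for missing entries, and the name `informative_columns` are assumptions made for illustration.

```julia
# Hypothetical helper: indices of the columns that contain at least one "1".
informative_columns(m) = [j for j in axes(m, 2) if any(==("1"), m[:, j])]

# Rows are languages, columns are characters.
# "-" marks missing data here (an assumption for this toy example).
m = ["1" "0" "-"
     "0" "0" "0"
     "-" "0" "1"]

keep = informative_columns(m)   # the second column has no "1" and is dropped
m_informative = m[:, keep]
@assert keep == [1, 3]
```

Dropping such columns removes characters that carry no information within the family.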
This completes data preparation for MrBayes.
In the next step, all MrBayes scripts must be run, ideally with as much parallelization as possible. I used the following shell script on a powerful server for this:
#!/bin/bash
# Script to run multiple instances of mb-mpi command using parallel processing
cd mrbayes || exit 1
max_jobs=25
run_with_limit() {
    # block until fewer than $max_jobs background jobs are still running
    while [ "$(jobs -rp | wc -l)" -ge "$max_jobs" ]; do
        sleep 1
    done
    mpirun -np "$1" mb-mpi "$2" &
}
# Main loop to run mb-mpi commands in parallel
# The two largest families are matched by name, so the numeric prefix does not matter
for file in *_head.mb.nex; do
    if [[ "$file" == *_Atlantic_Congo_head.mb.nex || "$file" == *_Austronesian_head.mb.nex ]]; then
        run_with_limit 16 "$file"
    else
        run_with_limit 4 "$file"
    fi
done
# all head runs must finish before the tail runs append to them
wait
for file in *_tail.mb.nex; do
    if [[ "$file" == *_Atlantic_Congo_tail.mb.nex || "$file" == *_Austronesian_tail.mb.nex ]]; then
        run_with_limit 16 "$file"
    else
        run_with_limit 4 "$file"
    fi
done
echo "All jobs submitted."
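The process counts 16 and 4 in the shell script appear to correspond to the total number of chains per analysis, i.e. runs times chains, under the settings chosen when the scripts were generated (4 runs with 4 chains each for families with more than 100 glottocodes, 2 runs with 2 chains otherwise). The sketch below makes that arithmetic explicit. `mpi_processes` is a hypothetical helper, and the assumption that one MPI process per chain is the useful maximum should be checked against the MrBayes documentation.

```julia
# Hypothetical helper: number of MPI processes for a family with `n_taxa`
# languages, assuming one process per chain (runs × chains).
function mpi_processes(n_taxa::Integer)
    n_chains = n_taxa > 100 ? 4 : 2
    n_runs = n_taxa > 100 ? 4 : 2
    return n_runs * n_chains
end

@assert mpi_processes(50) == 4     # small family: 2 runs × 2 chains
@assert mpi_processes(500) == 16   # large family: 4 runs × 4 chains
```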