SILICON @Stanford

@stanfordsilicon

Stanford Initiative on Language Inclusion and Conservation in Old and New Media | Advancing Digitally Disadvantaged Languages @Stanford

119 Followers
15 Following
260 Posts
Joined 12.08.2024

Latest posts by SILICON @Stanford @stanfordsilicon

The phenomenon of "I can't not do this"... until "I can't do this" underpins so much of this work. #FaceInterface

19.01.2025 02:15 👍 1 🔁 0 💬 0 📌 1

Audience comment: Look at the foundations you're building it on. With software, realized it wasn't a stable foundation and had to rewrite a lot of code. A tool nearly collapsed because of macOS updates, took multiple people 18 months to fix, nearly crippled a whole corner of the ecosystem.

19.01.2025 02:14 👍 2 🔁 0 💬 1 📌 0

Audience comment: so many projects are passions and hobbies, 80%+ donated time. Started thinking about business models and revenue streams from day 1, instead of making it and open-sourcing it and realizing I couldn't maintain it any longer. #FaceInterface

19.01.2025 02:13 👍 1 🔁 0 💬 1 📌 0

@tsmullaney.bsky.social Little taxes here and there (SFO landing fees, ICANN fees) enable people to get things done. This is how ICANN can run weeklong workshops to sort out non-Latin script domains, or curated exhibits at SFO airport. No one's going to argue about .99 surcharge on a plane ticket.

19.01.2025 02:11 👍 2 🔁 0 💬 1 📌 0

Marc Weber: the question of freezing language in amber -- the ship sailed with writing. If you don't update to the current changes in writing, you're doing harm. It's a much more weighty decision to orphan the languages. Necessary but not sufficient to encode languages. #FaceInterface

19.01.2025 02:06 👍 0 🔁 0 💬 0 📌 0

Hrant Papazian: "Preservation is for museums. Young people don't like to go to museums because they don't have a past. Young people want to make a future, and the future is the only way out. We have to build the future and get tools for the future. Young women matter more for Nüshu." #FaceInterface

19.01.2025 02:02 👍 0 🔁 0 💬 1 📌 0

@tsmullaney.bsky.social "Open source" is the great-great-grandchild of this romantic idea; there are communities that don't buy it. How does one confront this conflict between "shared heritage" and linguistic/cultural ownership? "Uni-" has a history; are we structurally repeating it? #FaceInterface

19.01.2025 02:00 👍 0 🔁 0 💬 1 📌 0

@tsmullaney.bsky.social Notion of shared human heritage is a free market concept, an open sound-stage of existence. Romantic vision of a shared destiny, beauty, etc. But when that vision was literally guiding principle of marching gunboats to remove an obstacle... 👀 #FaceInterface

19.01.2025 01:58 👍 1 🔁 0 💬 1 📌 0

Kamal Mansour: "What we're speaking about here is enabling digital writing, which is not the same as language. By representing the writing of particular languages in Unicode, enable them to create digital patrimony if they choose. It doesn't mean that's all they create." #FaceInterface

19.01.2025 01:54 👍 1 🔁 0 💬 1 📌 0

Audience comment: "I'm constantly running into how Unicode screws up my world. It doesn't take into consideration the full things necessary for the expression of the language. Nothing about the rules of typography and representation in layout. When fonts disagree you have chaos." #FaceInterface

19.01.2025 01:53 👍 0 🔁 0 💬 1 📌 0

Anushah Hossain: Holding up Unicode inclusion or digitization as the ultimate goal... is that too simplistic? What's the goal we're trying to achieve?

19.01.2025 01:52 👍 0 🔁 0 💬 1 📌 0

Peter Constable: "We should be enablers of local choice. And yes, that means that some languages will die. Make sure local communities know they have choices, choices are there, and we're there to support when they choose a path where we can help." #FaceInterface

19.01.2025 01:50 👍 0 🔁 0 💬 1 📌 0

Discussion question: Are we preserving language or enabling communication? Are we creating a Tower of Babel? Are we ossifying language through the unchanging nature of Unicode? #FaceInterface

19.01.2025 01:48 👍 1 🔁 0 💬 1 📌 0

@tsmullaney.bsky.social "Do you want to see the world saved, or do you want to be the one who saved the world?" In a structural way, this leads to fragmentation and silo-ization. Ego plays a part. What organization says "let's merge, and I'll take your name"? #FaceInterface

19.01.2025 01:45 👍 0 🔁 0 💬 0 📌 0

@tsmullaney.bsky.social wrapping up #FaceInterface. This is the third iteration of the conference: urge to create "compostable" organizations, whose work can obviate the need for them to exist. Can get drunk on the legacy of it, never really want to fully solve the problem, want people to come back.

19.01.2025 01:35 👍 0 🔁 0 💬 1 📌 0

Related to LoCoS is a preceding and influential pictographic writing system, Blissymbolics: letterformarchive.org/news/blissym... #FaceInterface

19.01.2025 01:24 👍 5 🔁 1 💬 0 📌 0
Image from the Face / Interface conference of Dr. Tawfik Jelassi, UNESCO Assistant Director-General for Communication and Information

“Linguistic diversity is not just a technical challenge to be solved; it is a cultural treasure to be cherished” -- Dr. Tawfik Jelassi, UNESCO Assistant Director-General for Communication and Information #FaceInterface #langsky

17.01.2025 22:58 👍 6 🔁 1 💬 0 📌 0

@tsmullaney.bsky.social at #FaceInterface "Privilege is the ability to move on to cooler problems", in the context of digitally-disadvantaged languages. For English we can worry about VR/AR, color fonts, etc. Most languages still need basic OCR/HTR/NLP. #MultilingualDH

19.01.2025 01:23 👍 11 🔁 4 💬 0 📌 0
Embroidered cover

Close up of embroidered cover

Nüshu script

Nüshu script

Eason Lu passed around a souvenir example of Nüshu script / 3rd day letter (with some embroidery on the cover!) #DHmakes #FaceInterface #MultilingualDH

19.01.2025 00:55 👍 16 🔁 5 💬 2 📌 0
dolma

Sina Ahmadi closes out the #FaceInterface slides the best way: "these are some dolma my wife and I made."

19.01.2025 01:11 👍 1 🔁 0 💬 0 📌 0

Sina Ahmadi: Goal is to create a machine translation system. Limited amount of data, so fine-tuning existing models. Meta's No Language Left Behind model covers 200 languages. Super-low BLEU score for these languages with NLLB, fine-tuning had big improvements. #FaceInterface

19.01.2025 01:08 👍 0 🔁 0 💬 1 📌 0

Sina Ahmadi: Hawrami had almost 10 people contributing. 46 hours of speech data collected using DOLMA speech bot. Used same multilingual corpus, asked people to select language and read sentences. 28k utterances! #FaceInterface

19.01.2025 01:06 👍 0 🔁 0 💬 1 📌 0

Sina Ahmadi: Gave volunteers a set of sentences in a highly resourced language they know and in English. Community-driven multilingual parallel corpus, > 50,000 sentences total. Previously some of the languages only had 100 sentences online. All sentences aligned with English. #FaceInterface

19.01.2025 01:05 👍 0 🔁 0 💬 1 📌 0

Sina Ahmadi: Some skepticism, "adding fuel to cultural hegemony of Turkish language". (Someone on Reddit was mad because of dolma reference.) Dolma is a food, but thought it'd be a good name because it'd be outside of politics. It's an acronym! Nothing to do with Turkish! #FaceInterface

19.01.2025 01:02 👍 0 🔁 0 💬 1 📌 0

Sina Ahmadi: Vision was community building, data collection, NLP development, scientific dissemination, sustainability & impact -- in that order. Intensive outreach campaign last fall, publishers, language experts, academics, native speakers. 30 highly active volunteers. #FaceInterface

19.01.2025 01:01 👍 0 🔁 0 💬 1 📌 0
NLP support

languages

Last but not least, SILICON Practitioner Sina Ahmadi on "Dolma-NLP: Developing Language Technologies for Middle Eastern Languages". Middle East is a linguistically complex place, rich diversity of languages. Only a few are officially recognized. Goal is sustainable lang tech community #FaceInterface

19.01.2025 00:59 👍 0 🔁 0 💬 1 📌 0

Loïc Marleix: As Ota said at the end of his book, the work cannot be done alone, and it's up to us to continue.

19.01.2025 00:51 👍 0 🔁 0 💬 0 📌 0

More on Nüshu from Lisa Huang’s 2021 Letterform Lecture: letterformarchive.org/events/view/...

18.01.2025 23:58 👍 32 🔁 10 💬 1 📌 1

One thing that makes @stanfordsilicon.bsky.social’s #FaceInterface a unique conference: despite hailing from many different disciplines (engineering, design, linguistics), almost everyone here knows by heart the difference between “character” & “glyph”.

(Okay, maybe also the Unicode Conference.)

19.01.2025 00:29 👍 6 🔁 1 💬 0 📌 0

Loïc Marleix: Next week, symbol maker will be available. Database will be available, open-source, no gatekeeping here. Website is just the starting point; Ota's goals 15 years ago are still valid now. Wanted Ota to see that people are still interested. #FaceInterface

19.01.2025 00:50 👍 0 🔁 0 💬 1 📌 0