Library of Congress Innovator in Residence graphic
Calling all technologists, artists and other creative visionaries:
The Library of Congress has opened the call for the next Innovator in Residence!
Apply by 2pm ET April 10
newsroom.loc.gov/news/library...
03.03.2026 19:28
CNI Fall 2025 Membership Meeting - YouTube
CNI Fall 2025 Membership Meeting videos at the Hyatt Regency on Capitol Hill Washington, DC. Learn more at: https://www.cni.org/mm/fall-2025
The plenary videos from the CNI meeting are now available:
Shaping CNI's Future Together
A Landscape of AI in Libraries @bcgl.bsky.social
AUPresses Stand UP Award @brettbobley.bsky.social @aupresses.bsky.social
The State of Funding for US Higher Ed, Science, and Technology in a Time of Change
19.12.2025 18:21
Publication day! My article on how to read an 18C newspaper, on digital remediation, and on the unfree press is out in the world. Thanks @andy-schocket.bsky.social @historymatterssyd.bsky.social & M. Karrs for making it all possible. @universitypress.cambridge.org
lnkd.in/eEwbdbSw
17.12.2025 15:54
Figured I'd wait to post anything for this, but I can't tell you how excited I am for us to have the opportunity to be funded to do this work. We really do believe that our project connecting communities, digital archives of the early Black press, and human-AI systems will have a large impact
12.12.2025 21:39
I really appreciate your kind words!
11.12.2025 22:55
For "Communities in the Loop: AI for Cultures & Contexts in Multimodal Archives," congrats to: Jim Casey, Christopher Dancy, @snblickhan.bsky.social, Tiffany Smith, Benjamin Lee. (And I must also include @profgabrielle.bsky.social !)
11.12.2025 17:32
Seconding all of this: so much fun to work with so many people I deeply admire, and we can't wait to share out as our project progresses!
11.12.2025 16:13
I can confirm, thinking with this crew is honestly wonderful, 10/10, Dream Team experience. I have more to say but a meeting to facilitate! More soon.
11.12.2025 15:59
Honored to be giving an opening plenary talk on AI & libraries later today at CNI 2025! I'm excited for the conversation with Kate Zwaard and for the full program by @cni-org.bsky.social!
www.cni.org/events/membe...
11.12.2025 16:10
Schmidt Sciences awards $750,000 to UCSB-led team to transform Black press archives with AI | Division of Humanities and Fine Arts
English professor Jim Casey leads a national coalition to recover 19th-century African American newspapers using machine learning and public crowdsourcing.
I am beyond thrilled to share some good news:
1. I've moved to a new job at UC Santa Barbara.
2. We have been selected for a 2025 Humanities and AI Virtual Institute award from @schmidtsciences.bsky.social!
Both, we hope, will allow us to continue building the work! +
hfa.ucsb.edu/news/schmidt...
11.12.2025 15:21
Thank you to @datarescueproject.org for publishing this blog post by @kdeeds.bsky.social and myself on GovScape! Extremely grateful to @datarescueproject.org for all their incredible work!
02.12.2025 17:40
Very excited to have a software paper with @yh-huang.bsky.social in the CHR journal on the Digital Collections Explorer, our open-source multimodal viewer for digital collections!
02.12.2025 17:36
New Research Tool: GovScape (US Gov PDFs)
govscape.net ||| Research Paper (preprint) About #GovScape arxiv.org/abs/2511.11010 #govdocs @eotarchive.org
19.11.2025 14:20
Anyone interested in govt transparency and public access should check out GovScape from @bcgl.bsky.social and his team.
It's an incredibly powerful tool that allows visual, semantic text, and keyword search of 10 million U.S. government PDFs (70 million pages!) and counting: www.govscape.net
19.11.2025 18:24
Thanks so much! Truly appreciate it!
19.11.2025 19:09
Thanks so much!
19.11.2025 03:09
We're live! Search 10 million+ U.S. government PDFs (70 million pages)! GovScape offers visual search, semantic text search, and keyword search. Explore below:
Website: govscape.net
ArXiv link: arxiv.org/abs/2511.11010
18.11.2025 21:16
Huge step forward in enabling access and use of content archived from government websites!
18.11.2025 20:27
7/ Lastly, we'd love to hear your feedback on GovScape at bcgl@uw.edu! For more updates on GovScape, follow: @govscape.bsky.social
18.11.2025 20:19
7/ A particular thank-you to @kdeeds.bsky.social for leading this project with me and for making this possible! And to @yh-huang.bsky.social, who did an incredible job with the front-end and dev-ops!
18.11.2025 20:19
6/ GovScape is the result of a multidisciplinary collaboration, co-led by myself and @kdeeds.bsky.social. We're enormously grateful to the team: Ying-Hsiang Huang, Claire Gong, Shreya Shaji, Alison Yan, Leslie Harka, @tjowens.bsky.social, @vphill.bsky.social, @shannonshen.bsky.social, and SJ Klein!
18.11.2025 20:19
GovScape: A Tutorial Video
YouTube video by GovScape
5/ Interested in learning more? Visit GovScape at www.govscape.net, try some searches, and read the FAQ! You can also watch a demo video here: www.youtube.com/watch?v=mNda...
18.11.2025 20:19
A visual search for "redacted documents" showing a number of documents with heavy redactions.
4/ What does visual search do? Here's a visual search for "redacted documents"
18.11.2025 20:19
A diagram showing the GovScape architecture, including the client, server, and databases.
3/ The full GovScape architecture is detailed in this figure, showing how the client interacts with the server, DBs, and indices. We utilize FAISS for text embeddings and for CLIP embeddings, and SQLite FTS5 for keyword indexing.
18.11.2025 20:19
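To illustrate the embedding-search side of the architecture described in 3/, here is a minimal sketch of nearest-neighbor lookup by inner product over normalized vectors. It is a pure-Python stand-in for what a flat inner-product index does, not GovScape's actual FAISS code; the page IDs, the three-dimensional vectors, and the `normalize` and `search` helpers are all hypothetical.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so inner product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def search(index, query, k=2):
    """Brute-force top-k search: score every stored vector against the query."""
    q = normalize(query)
    scored = [
        (sum(a * b for a, b in zip(q, vec)), page_id)
        for page_id, vec in index.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(page_id, score) for score, page_id in scored[:k]]


# Toy "page embeddings" (hypothetical; real CLIP or BGE vectors have hundreds of dimensions).
page_embeddings = {
    "page-a": normalize([1.0, 0.0, 0.0]),
    "page-b": normalize([0.9, 0.1, 0.0]),
    "page-c": normalize([0.0, 1.0, 0.0]),
}

results = search(page_embeddings, [1.0, 0.0, 0.1], k=2)
for page_id, score in results:
    print(f"{page_id}: {score:.4f}")
```

At the scale of 70 million pages, a dedicated index library such as FAISS replaces the Python loop, but the ranking idea is the same.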
A diagram showing the GovScape PDF pre-processing pipeline, including PDF identification, rendering, and semantification (including embedding generation).
2/ The pre-processing pipeline ingests PDFs, renders them, generates CLIP and BGE embeddings of individual pages, and indexes the text. The total compute cost for GovScape's pre-processing pipeline for 10 million PDFs was approximately $1,500. Our code is available at: github.com/bcglee/govsc....
18.11.2025 20:19
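For a sense of scale, the figures quoted in this thread can be turned into rough unit costs. The sketch below assumes the approximate $1,500 total, the 10 million PDFs, and the 70 million pages are all exact, so the results are ballpark figures only; the variable names are illustrative.

```python
# Figures as quoted in the thread (all approximate).
total_cost_usd = 1_500
num_pdfs = 10_000_000
num_pages = 70_000_000

cost_per_pdf = total_cost_usd / num_pdfs    # dollars per PDF
cost_per_page = total_cost_usd / num_pages  # dollars per page
pages_per_pdf = num_pages / num_pdfs        # average pages per PDF

print(f"Cost per 1,000 PDFs:  ${cost_per_pdf * 1_000:.2f}")
print(f"Cost per 1,000 pages: ${cost_per_page * 1_000:.4f}")
print(f"Average pages per PDF: {pages_per_pdf:.1f}")
```

Under these assumptions, pre-processing comes to about 15 cents per thousand PDFs, with an average of 7 pages per PDF.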
A diagram showing the three central query methods within GovScape: semantic text search, visual search, and keyword search.
2/ GovScape is built on top of the End of Term Web Archive (eotarchive.org) and currently contains all renderable PDFs (50 pages or fewer) from the 2020 crawl, documenting the first Trump administration. An overview of GovScape's search functionality can be found in this diagram.
18.11.2025 20:19
1/ Announcing GovScape, a public search system for 10 million U.S. government PDFs (70 million pages)! GovScape offers visual search, semantic text search, and keyword search. Explore below:
Website: www.govscape.net
ArXiv link: arxiv.org/abs/2511.11010
18.11.2025 20:19
Thanks so much, I appreciate it!
11.10.2025 21:37