#textdata
Posts tagged #textdata on Bluesky

No existing corpus that fits your niche research topic? Build your own corpus! With seed words, the corpus theme might be anything – even Christmas. www.sketchengine.eu/guide/create...
#textdata #textcorpus

Measuring Scalar Constructs in Social Science with LLMs
Many constructs that characterize language, like its complexity or emotionality, have a naturally continuous semantic structure; a public speech is not just "simple" or "complex," but exists on a cont...

Very interesting work on extracting 'scalar constructs' from #TextData with #LLMs by @haukelicht.bsky.social and colleagues: arxiv.org/abs/2509.03116


Are you an R user tired of missing out on the LLM craze?

In my new tutorial I show how to use OpenAI’s GPT and Google’s Gemini models to classify political texts. I connect to the APIs directly from R using reticulate.

alhdzsz.net/posts/llms_r...

#rstats #dataviz #ai #python #textdata

Shoutout to @ivelasq3.bsky.social and @posit.co for the opportunity to write a blog post about how I'm using `library(mall)` and integrating large language models into our energy security research! #textdata #LLM #energy #energysecurity #socialscience #datascience #NLProc

In this paper, we demonstrate that different #topics in #textdata exhibit varying degrees of #geospatiality, with some containing more #geographic mentions or #geotagged #locations than others.

The new Lithuanian Web corpus 2021.

Following our recent update for Lithuanian, we’re introducing the new Lithuanian Web corpus 2021! It's lemmatized, part-of-speech tagged, and classified by genres and topics.
#corpuslinguistics #digitalhumanities #textdata
www.sketchengine.eu/lttenten-lit...

Clustering Swap Prediction for Image-Text Pre-Training
Researchers introduced Clus, a novel clustering swap prediction strategy for learning an image-text embedding space, which leverages distillation learning to achieve state-of-the-art performance in ta...

1/2. 🖼️📝🤖 AI advances with clustering swap prediction for image-text pre-training, enhancing data efficiency and model performance. www.azoai.com/news/2024053... #AI #Innovation #Technology #MachineLearning #DataScience #Efficiency #ModelTraining #VisualData #TextData #Future


Sources reveal that OpenAI has explored training GPT-5 on public YouTube video transcripts. Additionally, experts suggest that the AI industry's demand for high-quality text data could surpass supply within the next two years. #OpenAI #GPT5 #YouTubeTranscripts #TextData #AIIndustry

Frequently cited QLR papers (2019):

1) Latent Dirichlet Allocation for #textdata
2) Test-retest reliability via intraclass correlation
3) #SysReview of impacts of patient reported outcome measures

#HRQoL
