corpusdelespanol.org
Corpus del Español: 2 billion words: Dialects / Genres / Historical
http://www.corpusdelespanol.org/x.asp
El corpus del español. Compare to other corpora. Created by Mark Davies. Funded by the US National Endowment for the Humanities (2001, 2015). Part of the BYU collection of corpora. As of Summer 2016, the Corpus del Español has two different parts: the (original, smaller) corpus that allows you to look at historical changes and genre-based variation, and the (new, much larger) corpus that you can use to look at dialectal variation (and have 100x as much data for Modern Spanish).
wordfrequency.info
Word frequency: based on 450 million word COCA corpus
http://www.wordfrequency.info/spanish.asp
From the Corpus del Español. 100,000 word list. 5,000-60,000 lemma lists. Free list (5,000). In addition to frequency lists for English, we also have what we believe are the most accurate frequency lists for Spanish, containing the top 20,000 lemmas / words in the language. The Spanish data is based on the 20 million words from the 1900s in the 100 million word Corpus del Español. The top 20,000 words (grouped by lemma, so salir: salgo, salimos, salieran). 20,000 lemma list. 1 Download and fill out the li...
wordfrequency.info
Word frequency: based on 450 million word COCA corpus
http://www.wordfrequency.info/sample.asp
Corpus of Contemporary American English. 100,000 word list. 5,000-60,000 lemma lists. Free list (5,000). There are a number of different formats available for the 5,000-60,000 word list, as shown below. You can also see samples for the 100,000 word list. Lemma, rank, part of speech, dispersion score. Text or Excel file; can be printed / copied. List sizes: 5,000, 20,000, 60,000. (6,000 entries: every tenth word 1-60,000). 2 Wordlist genre frequency. Excel file; can be printed / copied. 200-300 collocates ...
collocates.info
Collocates: based on 450 million word COCA corpus
http://www.collocates.info/comparison.asp
Corpus of Contemporary American English. There are relatively few collocates dictionaries or lists, other than what we have here. Some sites advertise collocates dictionaries, but they are far too small to include many collocates for many words (e.g. try smolder, adamantly, commemorative, boundless, none of which is found there, but all of which are found, with many collocates, in our data, with up to 200 collocates per word). With the Oxford Collocations Dictionary. Most importantly, with our lists you...
collocates.info
Collocates: based on 450 million word COCA corpus
http://www.collocates.info/uses.asp
Corpus of Contemporary American English. Collocates provide information on word meaning and usage, following the idea that you can tell a lot about a word by the words that it hangs out with. Let's look at two quick examples. (Note that the collocates below are grouped by part of speech and then sorted by frequency.) Dark, eyes, look, silence, presence, sky, sense, cloud, thought, mood, portrait, bird misc. Dark, over, sit, silent, heavy, gray, stare, handsome, mysterious, beneath, moody.
collocates.info
Collocates: based on 450 million word COCA corpus
http://www.collocates.info/samples.asp
Corpus of Contemporary American English. (Every hundredth word 1-60,000):
wordandphrase.info
Words and phrases: frequency, genres, collocates, concordances, synonyms, and WordNet
http://www.wordandphrase.info/x.asp
WORD AND PHRASE .INFO. SAMPLE FREQUENCY RANGE FROM TOP 60,000 WORDS IN COCA. SAMPLE FROM 170,000 TEXTS IN COCA. Wikipedia Corpus: find vocabulary on thousands of different topics.
ngrams.info
N-grams: based on 520 million word COCA corpus
http://www.ngrams.info/portuguese.asp
From the Corpus do Português. In addition to the COCA-based n-grams of English, we also have n-grams for Portuguese, based on the 20 million words of texts from the 1900s in the 45 million word Corpus do Português. Although the Spanish and Portuguese n-grams are based on much smaller corpora than COCA and COHA, they are still the only n-grams that we are aware of that are based on large, genre-balanced corpora. The following are the approximate numbers of n-grams: We will send you a short one-page NDA (...