@PolMine
PolMine / gist:76bbf7623e60bb19cde9658a8096c1d3
Created December 27, 2019 22:55
Get phrases using pointwise mutual information
library(polmineR)
use("GermaParl")
stopwords <- unname(unlist(
noise(
terms("GERMAPARL", p_attribute = "word"),
stopwordsLanguage = "de" # GermaParl is a German-language corpus
)
))
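The preview above stops before the pointwise mutual information (PMI) step itself. As a rough, self-contained illustration of the measure the title refers to (not the gist's own code), the following base R sketch scores adjacent word pairs in a toy token vector. The function name `pmi_bigrams` and the toy data are hypothetical, and the sketch assumes that tokens contain no spaces.

```r
# Hypothetical helper: PMI for adjacent token pairs, base R only.
# PMI(x, y) = log2( p(x, y) / (p(x) * p(y)) )
pmi_bigrams <- function(tokens) {
  n <- length(tokens)
  unigram <- table(tokens)                       # unigram counts
  bigram <- table(paste(tokens[-n], tokens[-1])) # counts of adjacent pairs
  parts <- strsplit(names(bigram), " ", fixed = TRUE)
  w1 <- vapply(parts, `[`, character(1), 1L)
  w2 <- vapply(parts, `[`, character(1), 2L)
  p_xy <- as.numeric(bigram) / (n - 1)           # n - 1 bigram positions
  p_x <- as.numeric(unigram[w1]) / n
  p_y <- as.numeric(unigram[w2]) / n
  out <- data.frame(
    first = w1, second = w2,
    count = as.integer(bigram),
    pmi = log2(p_xy / (p_x * p_y)),
    stringsAsFactors = FALSE
  )
  out[order(-out$pmi), ]                         # highest PMI first
}

# Toy data (an assumption for illustration, not taken from GermaParl)
tokens <- c("new", "york", "is", "big", "new", "york", "is", "far",
            "the", "city", "is", "new")
head(pmi_bigrams(tokens))
```

On real corpus data, a minimum count threshold is usually applied first, because PMI tends to overrate pairs of rare words.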
PolMine / gist:1683dd9baad585b48609689ff0ef1cf5
Created November 30, 2019 19:34
Download Topicmodel from Amazon S3
Sys.setenv(
"AWS_ACCESS_KEY_ID" = "<my-access-key>",
"AWS_SECRET_ACCESS_KEY" = "<my-secret-access-key>"
)
library(aws.s3)
get_bucket("polmine")
lda <- s3readRDS("corpora/cwb/germaparl/germaparl_lda_speeches_250.rds", bucket = "polmine")
PolMine / gist:a3dd727f3bfec24f0918d7ebe3de8033
Created November 11, 2019 08:58
word2vec workflow with polmineR
# This code, which is easy to adapt, trains a word2vec model. Note that it
# relies on the package [wordVectors](https://github.com/bmschmidt/wordVectors).
library(wordVectors)
file_out <- "~/Lab/tmp/germaparl.txt"
vectors_bin <- "~/Lab/tmp/germaparl.bin"
.fn <- function(x){
library(magick)
library(purrr)
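# `pattern` is a regular expression, not a glob, so match the .png extension explicitly
list.files(path = "~/Lab/tmp/", pattern = "\\.png$", full.names = TRUE) %>%
map(image_read) %>% # read each image file
image_join() %>% # join the images into one stack
image_animate(fps = 1) %>% # animate at one frame per second; the number of loops can be set too
image_write("~/Lab/annotation_demo.gif") # write the GIF to ~/Lab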
# install current version of cwbtools
library(drat)
drat::addRepo("polmine")
install.packages("cwbtools")
# Reencode installed corpus
library(polmineR)
PolMine / gist:5fa4a87fdbf89af6f1ac76b695a31f83
Created August 16, 2018 15:51
Get cooccurrence similarity
library(cooccurrences)
library(pbapply)
library(coop)
issues <- df %>% unlist() %>% unname() %>% as.character() %>% unique()
dt <- count("GERMAPARL", issues) %>%
setkeyv("count") %>% setorderv(cols = "count", order = -1L)
issues_min <- dt[count > 100][["query"]]
issues_min <- iconv(issues_min, from = "latin1", to = "UTF-8")
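The gist loads the coop package, presumably for its cosine similarity routines, but the excerpt ends before any similarity is computed. As a minimal sketch of the underlying idea, the base R function below computes pairwise cosine similarity between the columns of a matrix. The name `cosine_similarity` and the toy matrix are hypothetical and do not come from the gist.

```r
# Hypothetical helper: cosine similarity between all pairs of columns.
cosine_similarity <- function(m) {
  norms <- sqrt(colSums(m^2))        # Euclidean norm of each column
  crossprod(m) / outer(norms, norms) # dot products scaled by the norms
}

# Toy cooccurrence counts (an assumption): rows are context words,
# columns are the terms to be compared.
m <- cbind(
  a = c(1, 0, 2),
  b = c(2, 0, 4),
  c = c(0, 3, 0)
)
sim <- cosine_similarity(m)
round(sim, 3)
```

Columns `a` and `b` point in the same direction and therefore get a similarity of 1, while `a` and `c` share no context and get 0. Columns consisting only of zeros would produce `NaN` and need to be filtered out beforehand.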
PolMine / sentiws.R
Last active May 30, 2022 12:37
Import SentiWS dictionary for sentiment analysis into R as data.table
# The get_sentiws function will download the zip-file with the SentiWS dictionary,
# unzip it and return a data.table.
library(data.table)
get_sentiws <- function(){
sentiws_tmp_dir <- file.path(tempdir(), "sentiws")
if (!file.exists(sentiws_tmp_dir)) dir.create(sentiws_tmp_dir)
sentiws_zipfile <- file.path(sentiws_tmp_dir, "SentiWS_v1.8c.zip")
PolMine / ec2.Rmd
Last active March 28, 2017 21:42
Install RStudio Server and polmineR on Amazon EC2
After creating the Linux virtual machine on Amazon EC2, the hard part was configuring the Security Groups so that port 8787 is open, and attaching the newly defined Security Group to the instance. After that, everything ran smoothly as follows:
## Add CRAN mirror to sources.list, including key
```{sh}
sudo sh -c 'echo "deb http://cran.rstudio.com/bin/linux/ubuntu xenial/" >> /etc/apt/sources.list'
gpg --keyserver keyserver.ubuntu.com --recv-key E084DAB9
gpg -a --export E084DAB9 | sudo apt-key add -
```
PolMine / quick_start_polmineR.Rmd
Last active March 25, 2017 07:04
polmineR: Quick start
The most recent stable version of the package (v0.7.2) is available at CRAN:
https://cran.r-project.org/web/packages/polmineR/index.html
The new vignette / documentation can also be found there:
https://cran.r-project.org/web/packages/polmineR/vignettes/vignette.html
The package can be installed with the conventional package installation mechanism (from R):
```{r}
install.packages("polmineR")
```
PolMine / gettingPackagedCorpora.Rmd
Last active March 9, 2017 17:08
Packaged corpus installation
Installing a packaged corpus from the PolMine repository
--------------------------------------------------------
As an experiment, I have put a corpus of plenary protocols ("PLPRBT") into a private repository I host at the PolMine server. This is how to get it. You will need the devtools package to get the latest development version of polmineR. On Windows, installing devtools may require that you have installed Rtools.
```{r}
install.packages("devtools")
```
Now, install the development version of the polmineR package.