How to convert special symbols in web scraping with R?


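I am learning how to scrape the web with the XML and RCurl packages. Everything goes well except for one thing: special characters such as ö or č are read in differently in R. For instance, í is read in as a garbled sequence of other characters. I assume the latter is some sort of HTML encoding of the first.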

I have been looking for a way to convert these characters but have not found one. I am sure other people have stumbled upon this problem as well, and I suspect there must be some sort of function to convert these characters. Does anyone know a solution? Thanks in advance.

Here is an example of the code; sorry I did not provide it earlier.

    library(XML)
    url <- 'http://en.wikipedia.org/wiki/2000_wimbledon_championships_%e2%80%93_men%27s_singles'
    tables <- readHTMLTable(url)
    sec <- tables[[6]]
    pl1r1 <- unlist(strsplit(as.character(sec[,2]), ' '))[seq(2,32, 4)]
    enc2utf8(pl1r1) # does not seem to work

Try parsing the page first while specifying the encoding, and only then read the table, as described here: readHTMLTable and UTF-8 encoding.

An example might be:

    library(XML)
    url <- "http://en.wikipedia.org/wiki/2000_wimbledon_championships_%e2%80%93_men%27s_singles"
    # parse first, stating the encoding explicitly; this will preserve the characters
    doc <- htmlParse(url, encoding = "UTF-8")
    # readHTMLTable returns a list of data frames, one per table on the page
    tables <- readHTMLTable(doc, stringsAsFactors = FALSE)
    sec <- tables[[6]]
    # not sure what you're trying to do here, though
    pl1r1 <- unlist(strsplit(as.character(sec[,2]), ' '))[seq(2,32, 4)]
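If the strings have already been read in and look garbled, it may help to see what is probably going on at the byte level. The sketch below is not from the original post; it assumes the page is UTF-8 and that its bytes were labelled as Latin-1 somewhere along the way. It uses only base R, and the variable names (`original`, `garbled`, `repaired`, `double_encoded`, `undone`) are made up for illustration.

```r
# Assumption: the text is really UTF-8 but was treated as Latin-1.
original <- enc2utf8("\u00ed")   # "í", stored as the UTF-8 bytes c3 ad

# Reproduce the symptom: keep the same bytes but label them Latin-1,
# so each byte is treated as a separate character.
garbled <- rawToChar(charToRaw(original))
Encoding(garbled) <- "latin1"

# Repair, case 1: the bytes are untouched and only the label is wrong.
# Relabelling them as UTF-8 changes no bytes.
repaired <- garbled
Encoding(repaired) <- "UTF-8"
identical(repaired, original)

# Repair, case 2: the mislabelled text was later converted to UTF-8,
# so the bytes themselves changed (c3 ad became c3 83 c2 ad).
# Convert back to the Latin-1 bytes first, then relabel as UTF-8.
double_encoded <- enc2utf8(garbled)
undone <- iconv(double_encoded, from = "UTF-8", to = "latin1")
Encoding(undone) <- "UTF-8"
identical(undone, original)
```

This would also explain why `enc2utf8(pl1r1)` does not help: it converts according to the current encoding label, so if the label is wrong the result stays garbled. Under the same assumption, `Encoding(pl1r1) <- "UTF-8"` might be worth trying on the vector from the question, though stating the encoding at parse time as shown above is the cleaner fix.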
