R_Scraping [Movie Ratings, Bulletin Boards, Paper Abstracts]
Korea University Scraping Lecture Notes
myinno
2021-01-30
Overview
- Notes on a Korea University lecture series (YouTube) - Prof. 강필성
- Fall semester 2020, Korea University School of Industrial and Management Engineering, Programming Language for Data Analytics (R)
- [Korea University] Programming Language for Data Analytics (Undergraduate) KoreaUniv DSBA (06-1 ~ 06-4)
- 06-1: Web Scraping - Backgrounds
- After the roughly 8 hours of lectures, you can scrape competently with R
- I was not familiar with the topic ==> let's invest just one day
- Highly recommended (as far as I know, it is a first-year undergraduate course)
- Most of this is the lecture material itself; points explained during the lectures are added as comments
- Consists of four lectures
- Lecture 1: handling XML structure in R: the same thing I once implemented with an XML utility in Java… [2021/1/29]
- Lecture 2: an example that scrapes paper metadata [2021/1/30]
- Lecture 3: movie ratings (IMDB)
- Lecture 4: a Korean-language bulletin board
- Practical notes
- Chrome is the recommended browser [understand Chrome's developer tools - 'F12']
- Understand how web pages are built [View Source]
- An internet connection is required to follow along
Part 1: XPath with XML
- XPath: a syntax for addressing parts of an XML document. For more information, visit w3schools
- Using XPath requires understanding the document structure
- Only by understanding the characteristics of XML nodes and selecting just the parts you need can you pull exactly what you want from a web page
if (!requireNamespace("XML"))
install.packages("XML")
library("XML")
# XML/HTML parsing
obamaurl <- "http://www.obamaspeeches.com/"
obamaroot <- htmlParse(obamaurl) # actually fetches the page from the web
#obamaroot
# The parsed tree still has to be reduced to the relevant content;
# removing the unneeded parts requires a selection notation --> XPath
# XPath example
xmlfile <- "scraping/xml_example.xml"
tmpxml <- xmlParse(xmlfile)
root <- xmlRoot(tmpxml)
root
<bookstore>
<book category="cooking">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
<book category="children">
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
<book category="web">
<title lang="en">XQuery Kick Start</title>
<author>James McGovern</author>
<author>Per Bothner</author>
<author>Kurt Cagle</author>
<author>James Linn</author>
<author>Vaidyanathan Nagarajan</author>
<year>2003</year>
<price>49.99</price>
</book>
<book category="web">
<title lang="en">Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</year>
<price>39.95</price>
</book>
</bookstore>
# Select child nodes
XML::xmlChildren(root)[[1]] # double square brackets --> the children are treated as a list
<book category="cooking">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
# Compare each call below with the output above
XML::xmlChildren(xmlChildren(root)[[1]])[[1]]
<title lang="en">Everyday Italian</title>
XML::xmlChildren(xmlChildren(root)[[1]])[[2]]
<author>Giada De Laurentiis</author>
XML::xmlChildren(xmlChildren(root)[[1]])[[3]]
<year>2005</year>
XML::xmlChildren(xmlChildren(root)[[1]])[[4]]
<price>30.00</price>
# Selecting nodes
# a path starting with '/' is absolute, i.e. starts from the root
XML::xpathSApply(root, "/bookstore/book[1]")
[[1]]
<book category="cooking">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
XML::xpathSApply(root, "/bookstore/book[last()]")
[[1]]
<book category="web">
<title lang="en">Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</year>
<price>39.95</price>
</book>
XML::xpathSApply(root, "/bookstore/book[last()-1]")
[[1]]
<book category="web">
<title lang="en">XQuery Kick Start</title>
<author>James McGovern</author>
<author>Per Bothner</author>
<author>Kurt Cagle</author>
<author>James Linn</author>
<author>Vaidyanathan Nagarajan</author>
<year>2003</year>
<price>49.99</price>
</book>
XML::xpathSApply(root, "/bookstore/book[position()<3]")
[[1]]
<book category="cooking">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
[[2]]
<book category="children">
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
# Selecting attributes
# '//' : matches anywhere below the root; here, every category attribute
XML::xpathSApply(root, "//@category")
category category category category
"cooking" "children" "web" "web"
XML::xpathSApply(root, "//@lang")
lang lang lang lang
"en" "en" "en" "en"
XML::xpathSApply(root, "//book/title", xmlGetAttr, 'lang')
[1] "en" "en" "en" "en"
# Selecting atomic values
xpathSApply(root, "//title", xmlValue) # 모든 title의 value값값
[1] "Everyday Italian" "Harry Potter" "XQuery Kick Start"
[4] "Learning XML"
xpathSApply(root, "//title[@lang='en']", xmlValue)
[1] "Everyday Italian" "Harry Potter" "XQuery Kick Start"
[4] "Learning XML"
xpathSApply(root, "//book[@category='web']/price", xmlValue)
[1] "49.99" "39.95"
xpathSApply(root, "//book[price > 35]/title", xmlValue) # 35이상인 책의 title
[1] "XQuery Kick Start" "Learning XML"
xpathSApply(root, "//book[@category = 'web' and price > 40]/price", xmlValue) # 조건2개
[1] "49.99"
Part 2: Research Papers (arXiv)
- Reference: arXiv is a site for sharing research preprints
- Web Site : arXiv
- Search for "text mining" papers and scrape the title, abstract, and other metadata
- In the URL, only the trailing start=?? part changes from page to page
- Viewing the source in Chrome:
- 'F12' ==> developer tools
- Right-click the element you want on the HTML page and choose 'Inspect' to jump to the corresponding source
- 'toggle device toolbar': Ctrl + Shift + M
if (!requireNamespace("dplyr")) install.packages("dplyr")
if (!requireNamespace("stringr")) install.packages("stringr")
if (!requireNamespace("httr")) install.packages("httr")
if (!requireNamespace("rvest")) install.packages("rvest")
library(dplyr)
library(stringr) # string handling
library(httr) # web access
library(rvest) # scraping support
# initial URL of the "text mining" search
url <- 'https://arxiv.org/search/?query=%22text+mining%22&searchtype=all&source=header&start=0'
# inspect the URL structure
httr::parse_url(url) # the only value that has to change per page is $query$start
$scheme
[1] "https"
$hostname
[1] "arxiv.org"
$port
NULL
$path
[1] "search/"
$query
$query$query
[1] "\"text+mining\""
$query$searchtype
[1] "all"
$query$source
[1] "header"
$query$start
[1] "0"
$params
NULL
$fragment
NULL
$username
NULL
$password
NULL
attr(,"class")
[1] "url"
start <- proc.time()
title <- NULL
author <- NULL
subject <- NULL
abstract <- NULL
meta <- NULL
#pages <- seq(from = 0, to = 430, by = 50)
pages <- seq(from = 0, to = 100, by = 50) # run only a subset here
# running the full version of this loop takes about 15 minutes
for( i in pages){
tmp_url <- httr::modify_url(url, query = list(start = i)) # modify only the start part of url
#p.list-title.is-inline-block ==> identified on the page via Chrome developer tools (F12)
# <p class="list-title is-inline-block"><a href="https://arxiv.org/abs/2101.12177">arXiv:2101.12177</a>
# <span> [<a href="https://arxiv.org/pdf/2101.12177">pdf</a>] </span>
# </p>
#
## collect the paper links on this page into tmp_list
read_html(tmp_url) %>%
html_nodes('p.list-title.is-inline-block') %>% # returns the <p> above (dots in the selector stand for the spaces in class="list-title is-inline-block")
html_nodes('a[href^="https://arxiv.org/abs"]') %>% # '[...]' filters on an attribute
html_attr('href') -> tmp_list # 50 URLs extracted
# visit each link and read the details of each paper
# for(j in 1:length(tmp_list)){
for(j in 1:5){ # run only a few here
tmp_paragraph <- read_html(tmp_list[j])
# title ==>
#<h1 class="title mathjax"><span class="descriptor">Title:</span>Conjoined Dirichlet Process</h1>
tmp_title <- tmp_paragraph %>% html_nodes('h1.title.mathjax') %>% html_text(T)
tmp_title <- gsub('Title:', '', tmp_title) #[1] "Conjoined Dirichlet Process"
title <- c(title, tmp_title)
# author
# <div class="authors"><span class="descriptor">Authors:</span>
# <a href="https://arxiv.org/search/stat?searchtype=author&query=Ngo%2C+M+N">Michelle N. Ngo</a>,
# <a href="https://arxiv.org/search/stat?searchtype=author&query=Pluta%2C+D+S">Dustin S. Pluta</a>,
# <a href="https://arxiv.org/search/stat?searchtype=author&query=Ngo%2C+A+N">Alexander N. Ngo</a>,
# <a href="https://arxiv.org/search/stat?searchtype=author&query=Shahbaba%2C+B">Babak Shahbaba</a>
#</div>
tmp_author <- tmp_paragraph %>% html_nodes('div.authors') %>% html_text #"Authors:Michelle N. Ngo, Dustin S. Pluta, Alexander N. Ngo, Babak Shahbaba"
# html_text ==> extracts only the text
tmp_author <- base::gsub('\\s+',' ',tmp_author) # collapse runs of whitespace (tabs included) into one space
tmp_author <- base::gsub('Authors:','',tmp_author) %>% str_trim # str_trim: remove leading/trailing whitespace
author <- c(author, tmp_author)
# subject
# <td class="tablecell subjects">
# <span class="primary-subject">Machine Learning (stat.ML)</span>; Machine Learning (cs.LG); Methodology (stat.ME)
#</td>
tmp_subject <- tmp_paragraph %>% html_nodes('span.primary-subject') %>% html_text(T) #"Machine Learning (stat.ML)"
subject <- c(subject, tmp_subject)
# abstract
tmp_abstract <- tmp_paragraph %>% html_nodes('blockquote.abstract.mathjax') %>% html_text(T)
tmp_abstract <- gsub('\\s+',' ',tmp_abstract) # newlines also become spaces
tmp_abstract <- sub('Abstract:','',tmp_abstract) %>% str_trim
abstract <- c(abstract, tmp_abstract)
# meta
tmp_meta <- tmp_paragraph %>% html_nodes('div.submission-history') %>% html_text
#gsub('\\s+', ' ',tmp_meta)
gsub('\\s+', ' ',tmp_meta) %>%
strsplit('[v1]', fixed = T) %>%
lapply('[',2) %>% # take the 2nd element of each list item [M[1, 2] == `[`(M, 1, 2)]
unlist %>%
str_trim -> tmp_meta
# tmp_meta <- lapply(strsplit(gsub('\\s+', ' ',tmp_meta), '[v1]', fixed = T),'[',2) %>% unlist %>% str_trim
meta <- c(meta, tmp_meta)
# cat(j, "paper\n")
Sys.sleep(1) # sleep 1s so we do not hit the server too quickly and get flagged
}
# cat((i/50) + 1,'/ 9 page\n')
}
papers <- data.frame(title, author, subject, abstract, meta)
end <- proc.time()
end - start # Total Elapsed Time
user system elapsed
0.98 0.05 37.73
# Export the result
save(papers, file = "Arxiv_Text_Mining.RData")
write.csv(papers, file = "scraping/Arxiv papers on Text Mining.csv")
### Be sure to open the result file in Excel.
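A small verification sketch (assuming the files written above): re-load the exported object and inspect it before opening the CSV.
load("Arxiv_Text_Mining.RData") # restores the 'papers' data frame
str(papers) # 5 columns: title, author, subject, abstract, meta
head(papers$title)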
Part 3: Movie Ratings (IMDB Top 50 Movies)
- Collect movie review data
- Scrape movies from IMDB…
- Movie title, year, average rating, writers, reviews, …
library(dplyr)
library(stringr)
library(httr)
library(rvest)
url <- 'https://www.imdb.com/search/title/?groups=top_250&sort=user_rating'
start <- proc.time()
imdb_top_50 <- data.frame() # initialize
cnt <- 1 # running row index of the record being collected
#<h3 class="lister-item-header">
# <span class="lister-item-index unbold text-primary">1.</span>
# <a href="/title/tt0111161/?ref_=adv_li_tt">The Shawshank Redemption</a>
# <span class="lister-item-year text-muted unbold">(1994)</span>
#</h3>
tmp_list <- read_html(url) %>% html_nodes('h3.lister-item-header') %>%
html_nodes('a[href^="/title"]') %>% html_attr('href')
for( i in 1:50){
tmp_url <- paste('http://imdb.com', tmp_list[i], sep="") #"http://imdb.com/title/tt0111161/?ref_=adv_li_tt"
tmp_content <- read_html(tmp_url)
# Extract title and year
#<div class="title_wrapper">
# <h1 class="">The Shawshank Redemption <span id="titleYear">
# (<a href="/year/1994/?ref_=tt_ov_inf">1994</a>)</span> </h1>
# <div class="subtext">15
# ...
title_year <- tmp_content %>% html_nodes('div.title_wrapper > h1') %>% html_text %>% str_trim
# 'div.title_wrapper > h1' ==> find "div.title_wrapper", then its child 'h1'
tmp_title <- substr(title_year, 1, nchar(title_year)-7)
tmp_year <- substr(title_year, nchar(title_year)-4, nchar(title_year)-1)
tmp_year <- as.numeric(tmp_year)
# Average rating
#<div class="ratingValue">
#<strong title="9.3 based on 2,342,117 user ratings"><span itemprop="ratingValue">9.3</span></strong><span #class="grey">/</span><span class="grey" itemprop="bestRating">10</span> </div>
tmp_rating <- tmp_content %>% html_nodes('div.ratingValue > strong > span') %>% html_text
tmp_rating <- as.numeric(tmp_rating)
# Rating counts
#<span class="small" itemprop="ratingCount">2,342,117</span>
tmp_count <- tmp_content %>% html_nodes('span.small') %>% html_text
tmp_count <- gsub(",", "", tmp_count)
tmp_count <- as.numeric(tmp_count)
# Summary
tmp_summary <- tmp_content %>% html_nodes('div.summary_text') %>% html_text %>% str_trim
# Director, Writers, and Stars
# the three values are not separated by tags, only by their text content
# <div class="plot_summary ">
# ...
# <div class="credit_summary_item">
# <h4 class="inline">Director:</h4>
# <a href="/name/nm0001104/?ref_=tt_ov_dr">Frank Darabont</a> </div>
# <div class="credit_summary_item">
# <h4 class="inline">Writers:</h4>
# <a href="/name/nm0000175/?ref_=tt_ov_wr">Stephen King</a> (short story "Rita Hayworth and Shawshank Redemption"), <a href="/name/nm0001104/?ref_=tt_ov_wr">Frank Darabont</a> (screenplay) </div>
# <div class="credit_summary_item">
# <h4 class="inline">Stars:</h4>
tmp_dws <- tmp_content %>% html_nodes('div.credit_summary_item') %>% html_text
tmp_director <- tmp_dws[1] %>% str_trim
tmp_director <- sub("Director:\n", "", tmp_director)
tmp_writer <- tmp_dws[2] %>% str_trim
tmp_writer <- sub("Writers:\n", "", tmp_writer)
tmp_stars <- tmp_dws[3] %>% str_trim
tmp_stars <- strsplit(tmp_stars, "\nSee")[[1]][1]
tmp_stars <- sub("Stars:\n", "", tmp_stars)
tmp_stars <- substr(tmp_stars, 1, nchar(tmp_stars)-1) %>% str_trim
# Reviews
# check the URL pattern of the review pages
# Extract the first 25 reviews
title_id <- strsplit(tmp_list[i], "/")[[1]][3]
review_url <- paste("https://www.imdb.com/title/", title_id, "/reviews?ref_=tt_urv", sep="")
tmp_review <- read_html(review_url) %>% html_nodes('div.review-container')
for(j in 1:25){
# cat("Scraping the", j, "-th review of the", i, "-th movie. \n")
tryCatch({ #tryCatch({}, error = function(e){print("...")})
# Review rating
tmp_info <- tmp_review[j] %>% html_nodes('span.rating-other-user-rating > span') %>% html_text
tmp_review_rating <- as.numeric(tmp_info[1]) # only the leading number is needed
# Review title
tmp_review_title <- tmp_review[j] %>% html_nodes('a.title') %>% html_text
tmp_review_title <- tmp_review_title %>% str_trim
# Review text
tmp_review_text <- tmp_review[j] %>% html_nodes('div.text.show-more__control') %>% html_text
tmp_review_text <- gsub("\\s+", " ", tmp_review_text)
tmp_review_text <- gsub("\"", "", tmp_review_text) %>% str_trim
# Store the results
imdb_top_50[cnt,1] <- tmp_title
imdb_top_50[cnt,2] <- tmp_year
imdb_top_50[cnt,3] <- tmp_rating
imdb_top_50[cnt,4] <- tmp_count
imdb_top_50[cnt,5] <- tmp_summary
imdb_top_50[cnt,6] <- tmp_director
imdb_top_50[cnt,7] <- tmp_writer
imdb_top_50[cnt,8] <- tmp_stars
imdb_top_50[cnt,9] <- tmp_review_rating
imdb_top_50[cnt,10] <- tmp_review_title
imdb_top_50[cnt,11] <- tmp_review_text
cnt <- cnt+1
}, error = function(e){print("An error occurs, skip the review")})
}
Sys.sleep(1) # Pretending not to be a bot
}
[1] "An error occurs, skip the review"
[1] "An error occurs, skip the review"
names(imdb_top_50) <- c("Title", "Year", "Avg.Rating", "RatingCounts", "Summary", "Director",
"Writer", "Stars", "Review.Rating", "Review.Title", "Review.Text")
end <- proc.time()
end - start # Total Elapsed Time
user system elapsed
27.21 1.29 288.26
# Export the result
#save(imdb_top_50, file = "imdb_top_50.RData")
write.csv(imdb_top_50 , file = "scraping/imdb_top_50.csv")
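A short follow-up sketch (assuming the imdb_top_50 data frame built above): each movie contributes up to 25 review rows, so a dplyr aggregation condenses the table back to one row per title.
library(dplyr)
imdb_top_50 %>%
  group_by(Title) %>%
  summarise(Avg.Review.Rating = mean(Review.Rating, na.rm = TRUE),
            N.Reviews = n()) %>%
  arrange(desc(Avg.Review.Rating)) %>%
  head()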
Part 4: Korean-language Pages (ppomppu)
- Scraping a Korean-language page [www.ppomppu.co.kr]
- First 10 pages of the 'insurance forum'
- Processing steps
- Work out the page URL structure (find the part that changes from page to page)
- Handle the Korean text encoding (see the sketch after this list)
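A minimal sketch for the encoding step (assumption: the board may be served in a legacy Korean encoding such as EUC-KR rather than UTF-8). rvest::guess_encoding() ranks candidate encodings for garbled text; the top candidate can then be passed to read_html(..., encoding = ...) as an alternative to repair_encoding().
library(rvest)
board_url <- 'http://www.ppomppu.co.kr/zboard/zboard.php?id=insurance&page=1&divpage=13'
raw_text <- read_html(board_url) %>% html_nodes('a') %>% html_text()
guess_encoding(paste(raw_text, collapse = " ")) # candidate encodings with confidence scores
# e.g. read_html(board_url, encoding = "EUC-KR") if EUC-KR ranks first (assumption)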
library(dplyr)
library(stringr)
library(httr)
library(rvest)
url <- 'http://www.ppomppu.co.kr/zboard/zboard.php?id=insurance&page='
start <- proc.time()
ppomppu_insurance <- data.frame()
Npost <- 1
# Extract the link of each post (for first 10 pages)
for( i in c(1:10)){ # Page
tryCatch({
tmp_url <- paste(url, i, '&divpage=13', sep="")
# the post rows are split across two classes: list0 and list1
tmp_list0 <- read_html(tmp_url) %>% html_nodes('tr.list0') %>% html_nodes('a') %>% html_attr('href')
tmp_list1 <- read_html(tmp_url) %>% html_nodes('tr.list1') %>% html_nodes('a') %>% html_attr('href')
tmp_list0 <- paste0('http://www.ppomppu.co.kr/zboard/',tmp_list0)
tmp_list1 <- paste0('http://www.ppomppu.co.kr/zboard/',tmp_list1)
tmp_list <- c(tmp_list0, tmp_list1)
for(j in 1:length(tmp_list)){ # one post
# cat("Processing ", j, "-th Post of ", i, "-th page \n", sep="")
tryCatch({
tmp_paragraph <- read_html(tmp_list[j])
# title
#<font class="view_title2"><!--DCM_TITLE-->보험 설계 봐주세요.<!--/DCM_TITLE--></font>
tryCatch({
tmp_title <- rvest::repair_encoding(tmp_paragraph %>% html_nodes('font.view_title2') %>% html_text(T))
}, error = function(e){tmp_title <<- NULL}) # '<<-' so the assignment takes effect outside the handler
# date
tryCatch({
tmp_date <- repair_encoding(tmp_paragraph %>% html_nodes('td.han') %>% html_text(T))[2]
date_start_idx <- gregexpr(pattern ='등록일', tmp_date)[[1]][1]
tmp_date <- substr(tmp_date, date_start_idx+5, date_start_idx+20)
}, error = function(e){tmp_date <<- NULL})
# contents
tryCatch({
tmp_contents <- repair_encoding(tmp_paragraph %>% html_nodes('td.board-contents') %>% html_text(T))
tmp_contents <- gsub("[[:punct:]]", " ", tmp_contents) #문장기호 !.?
tmp_contents <- gsub("[[:space:]]", " ", tmp_contents) #Space는 한칸 space
tmp_contents <- gsub("\\s+", " ", tmp_contents) #
tmp_contents <- stringr::str_trim(tmp_contents, side = "both") # 양쪽 trim
}, error = function(e){tmp_contents <- NULL})
## Also collect the reply comments -----
tmp_comments <- tmp_paragraph %>% html_nodes('div.comment_wrapper')
df_comment <- data.frame()
for(k in seq_along(tmp_comments)) { # seq_along() also handles posts with no comments
# cat("Comment", k, "of the", j, "-th post on the", i, "-th page\n")
temp_str <- tmp_comments[k] %>% html_nodes('div.over_hide.link-point') %>% html_text
base::gsub("[[:space:]]", " ", temp_str) %>%
base::gsub("\\s+", " ", .) %>%
stringr::str_trim(., side = "both") -> temp_str
# cat(' --> ' , temp_str, '\n')
df_comment[k,1] <- temp_str
}
if (nrow(df_comment) > 0) {
for (k in seq_len(nrow(df_comment))) {
ppomppu_insurance[Npost,1] <- tmp_title
ppomppu_insurance[Npost,2] <- tmp_date
ppomppu_insurance[Npost,3] <- tmp_contents
ppomppu_insurance[Npost,4] <- df_comment[k,1]
Npost <- Npost + 1
}
} else {
ppomppu_insurance[Npost,1] <- tmp_title
ppomppu_insurance[Npost,2] <- tmp_date
ppomppu_insurance[Npost,3] <- tmp_contents
Npost <- Npost + 1
}
}, error = function(e){print("Invalid conversion, skip the post")})
}
}, error = function(e){print("Invalid conversion, skip the page")})
}
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
[1] "Invalid conversion, skip the post"
end <- proc.time()
end - start # Total Elapsed Time
user system elapsed
18.10 1.12 73.76
# Export the result
write.csv(ppomppu_insurance, file = "scraping/ppomppu_insurance.csv")
## Be sure to open the result file in Excel.