R web scraping function getPageNumber error
I am building a web scraper and trying to understand why my getPageNumber function does not work. The function worked last night, but tonight it no longer returns the right output.
library(rvest)
library(RCurl)
library(XML)
library(stringr)
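getPageNumber <- function(URL) {
  # Download and parse the search results page
  parsedDocument <- read_html(URL)
  # Count the result entries shown on this page
  results_per_page <- length(parsedDocument %>% html_nodes(".sr-list"))
  # Pull the total result count out of the page source;
  # str_match() yields NA here if 'num_results' is not in the page
  total_results <- parsedDocument %>%
    toString() %>%
    str_match(., 'num_results":"(.*?)"') %>%
    .[, 2] %>%
    as.integer()
  # Fall back to 1 page on error (note: an NA total does not raise an error)
  pageNumber <- tryCatch(ceiling(total_results / results_per_page), error = function(e) {1})
  return(pageNumber)
}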
getPageNumber("https://academic.oup.com/dnaresearch/search-results?rg_IssuePublicationDate=01%2F01%2F2010%20TO%2012%2F31%2F2010&fl_SiteID=5275&page=")
The output I am getting is NA, when it should be a number.
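One way an NA could come out of this function: if the page source no longer contains the num_results string (for example because the site changed its markup or served a different page), str_match() returns NA, and arithmetic on NA does not raise an error, so the tryCatch() fallback never runs. Whether that is what happened here is an assumption, since the page itself is not shown. A minimal sketch of an explicit guard, using a hypothetical helper name safePageCount and made-up sample strings in place of the real page:

```r
library(stringr)

# Hypothetical helper (not part of the original code): takes the page source
# as text and the number of results per page, and returns a page count.
safePageCount <- function(page_text, results_per_page) {
  # The capture column is NA when the pattern is absent from the text
  total_results <- as.integer(str_match(page_text, 'num_results":"(.*?)"')[, 2])
  # NA arithmetic and division by zero do not signal errors,
  # so they are checked explicitly instead of relying on tryCatch()
  if (is.na(total_results) || results_per_page == 0) {
    return(1)
  }
  ceiling(total_results / results_per_page)
}

# Made-up sample inputs for illustration only
safePageCount('{"num_results":"45"}', 20)          # returns 3
safePageCount('<html>no counter here</html>', 20)  # returns 1

# The original fallback does not trigger on NA:
tryCatch(ceiling(NA_integer_ / 20), error = function(e) 1)  # returns NA
```

Printing toString(parsedDocument) and searching it for num_results would show whether the string is still present in what read_html() actually receives.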
from Recent Questions - Stack Overflow https://ift.tt/3u2Csgo