I recently found another useful function that can save you time when working with R. Let's say we have some tabular data on an HTML page, like here. With the readHTMLTable function from the XML package, reading it into R could not be simpler:
> install.packages("XML")
> require(XML)
>
> d <- readHTMLTable("http://en.wikipedia.org/wiki/Transistor_count")
> class(d)
[1] "list"
Each table on the page was extracted and converted into a data.frame, so d is a list with one data.frame per table. Let's have a look at the first one:
> table_a <- d[[1]]
> head(table_a)
            Processor Transistor count Date of introduction   Manufacturer Process   Area
1          Intel 4004            2,300                 1971          Intel   10 µm 12 mm²
2          Intel 8008            3,500                 1972          Intel   10 µm 14 mm²
3 MOS Technology 6502         3,510[1]                 1975 MOS Technology    8 μm 21 mm²
4       Motorola 6800            4,100                 1974       Motorola    6 μm 16 mm²
5          Intel 8080            4,500                 1974          Intel    6 μm 20 mm²
6            RCA 1802            5,000                 1974            RCA    5 μm 27 mm²
The header cells from the table were used as column names.
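To see the header handling without depending on a live page, here is a minimal sketch that parses a small made-up HTML snippet (the `html` string is invented for illustration and only mimics two rows of the table above). It assumes the XML package is installed, and it uses the `which` argument to get a single table back instead of a list. The clean-up at the end is just one possible way to turn the text cells into numbers, not something readHTMLTable does for you.

```r
library(XML)

# A tiny made-up HTML page, so the example runs without network access.
html <- '<html><body><table>
<tr><th>Processor</th><th>Transistor count</th></tr>
<tr><td>Intel 4004</td><td>2,300</td></tr>
<tr><td>MOS Technology 6502</td><td>3,510[1]</td></tr>
</table></body></html>'

doc <- htmlParse(html, asText = TRUE)

# which = 1 returns the first table directly instead of a list of tables;
# stringsAsFactors = FALSE keeps the cells as character strings.
tab <- readHTMLTable(doc, which = 1, stringsAsFactors = FALSE)

# The <th> cells of the first row become the column names.
names(tab)

# Strip footnote markers such as [1] and the thousands separators
# before converting the counts to numbers.
count <- gsub("\\[[0-9]+\\]", "", tab[["Transistor count"]])
tab$count <- as.numeric(gsub(",", "", count))
tab
```

Everything comes back as text, so a clean-up step like the last two lines is usually needed before doing arithmetic on a scraped column.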
Other read functions
It's worth mentioning that the other read functions (such as read.table or read.csv) can also read documents hosted on a server: pass the URL in place of a file name and there is no need to download the file first. I wish I had learned that sooner.
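A minimal sketch of reading from a URL with read.csv follows. To keep it runnable offline, it writes a small made-up CSV to a temporary file and reads it back through a `file://` URL. The column names and the commented-out `http://example.com/cpus.csv` address are hypothetical and only stand in for a real remote file.

```r
# Write a small CSV to a temporary file so the example needs no network access.
tmp <- tempfile(fileext = ".csv")
writeLines(c("processor,year", "Intel 4004,1971", "Intel 8008,1972"), tmp)

# Build a file:// URL for it (Windows paths need an extra leading slash).
path <- normalizePath(tmp, winslash = "/")
prefix <- if (.Platform$OS.type == "windows") "file:///" else "file://"
local_url <- paste0(prefix, path)

# read.csv treats the URL like any other file argument.
cpus <- read.csv(local_url, stringsAsFactors = FALSE)
str(cpus)

# A remote file is read the same way (hypothetical address):
# cpus <- read.csv("http://example.com/cpus.csv")
```

The same pattern should apply to read.table and its other wrappers, since they all accept a URL in the `file` argument.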