150512.md

Today's range: p.17 "How to do it..." from step (6) onward

Prepare today's data set, iris
> data(iris)
> dim(iris)
[1] 150   5


----p.17: How to do it ...  (6)--------------------------------

* Understand the rnorm function: it generates random samples from a normal distribution.

> r1 <- rnorm(n=30, mean=1, sd=0.1)
                   # generates 30 samples from a normal distribution with mean 1 and standard deviation 0.1
> hist(r1)    # shows a histogram in the Plots pane on the right



> r2 <- rnorm(30,1,0.1)  # the argument names can be omitted
> r2
 [1] 1.1178794 1.0397319 1.1823451 1.0711442 0.8951000 1.0480523
 [7] 1.0575540 0.8427557 1.0371548 0.9453643 1.0241647 0.9978094
[13] 0.9937166 1.1514011 1.1076591 0.9793242 0.8742485 0.7217720
[19] 0.7526474 1.0042627 1.0630212 1.0012110 0.7663690 0.8845721
[25] 0.8831569 0.9861942 1.0378971 1.0402650 0.9569134 1.0752101
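Because rnorm draws random numbers, the values above change on every run. A minimal sketch of how the draws could be made repeatable with set.seed (the seed value 123 and the names a and b are arbitrary choices for this illustration, not from the book):

```r
# Fixing the seed makes the "random" draws repeatable.
set.seed(123)                           # 123 is an arbitrary seed (assumption)
a <- rnorm(n = 30, mean = 1, sd = 0.1)  # named arguments
set.seed(123)                           # reset to the same seed
b <- rnorm(30, 1, 0.1)                  # positional arguments: n, mean, sd
identical(a, b)                         # TRUE: same seed, same sample
```

This also confirms that the named and the positional call are the same call.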

* Understand how to compute basic statistics

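> mean(r1)    # mean of the data
[1] 0.9846299
> sd(r1)    # standard deviation of the data
[1] 0.1142747
> var(r1)   # variance of the data
[1] 0.01305872

Review of the previous session
> summary(r1)  # summary of the basic statistics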
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 0.7218  0.9077  1.0030  0.9846  1.0550  1.1820 

How do we get the minimum and maximum individually?
> max(r1)    # maximum
[1] 1.182345
> min(r1)    # minimum
[1] 0.721772
> median(r1)  # median
[1] 1.002737
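The statistics above are related to each other. The following sketch checks those relations on a freshly generated sample (x and the seed are hypothetical, so the numbers will differ from the r1 output above):

```r
set.seed(1)                               # arbitrary seed (assumption)
x <- rnorm(30, mean = 1, sd = 0.1)        # hypothetical sample, not r1
all.equal(var(x), sd(x)^2)                # TRUE: variance is the squared sd
range(x)                                  # min and max in a single call
quantile(x, probs = c(0.25, 0.5, 0.75))   # the quartiles that summary() reports
```

In the transcript the same relation holds: 0.1142747 squared is about 0.01305872.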

---------------------------------------------------------------------------
Back to p.17: generate the Stalk.Length data

> Stalk.Length <- c (rnorm(30,1,0.1),rnorm(30,1.3,0.1),rnorm(30,1.5,0.1),rnorm(30,1.8,0.1),rnorm(30,2,0.1))   # generates 30 x 5 = 150 values with different means and the same variance
> Stalk.Length   # display the generated data to check it
  [1] 0.8921421 1.0550530 1.1308808 1.0796513 0.9369771 1.2047349
  [7] 1.0407067 1.1382617 0.8394090 1.0031632 0.9353475 0.9985884
 [13] 1.0261368 0.9268915 1.1213712 0.9048967 0.9542164 0.7478351
 [19] 0.9467987 0.9610430 0.9189688 0.9963100 1.0728138 0.8803209
 [25] 0.8175551 1.0820303 1.0246283 0.9456785 1.0258660 1.0243214
 [31] 1.3204919 1.4286443 1.3333357 1.3111258 1.3336106 1.3329345
 [37] 1.5286558 1.4232648 1.2486023 1.2125026 1.4306571 1.3297766
 [43] 1.2885587 1.1378017 1.1925756 1.2586319 1.1912726 1.3750280
 [49] 1.5523592 1.3202334 1.1852053 1.3818343 1.2386627 1.3716788
 [55] 1.1625793 1.3611949 1.1842638 1.2880419 1.1900505 1.4750548
 [61] 1.4866536 1.3839021 1.4773459 1.6433605 1.3753647 1.5001304
 [67] 1.5762595 1.5509684 1.4047298 1.4473762 1.4541600 1.5092810
 [73] 1.6442589 1.3599851 1.4266139 1.5629679 1.5051306 1.5866319
 [79] 1.5243248 1.2778475 1.5087839 1.6571711 1.5519017 1.4490116
 [85] 1.5304438 1.3881790 1.2822249 1.4548542 1.5734907 1.3877803
 [91] 1.8044793 1.7866928 1.8376624 1.7233526 1.6510363 1.8221745
 [97] 1.6071475 1.8699298 1.5356849 1.8635871 1.6344142 1.9058911
[103] 1.7820783 1.7989895 1.8578040 1.6702627 1.6751173 1.5967798
[109] 1.7116072 1.9686936 1.6109489 1.9218557 2.2085703 1.6851783
[115] 1.7459801 1.7708710 1.7672610 1.8109322 1.6588718 1.6699241
[121] 2.0098012 1.9673040 2.0349714 2.0115345 2.1436555 2.0974409
[127] 2.0470874 1.9747315 2.0510872 1.8614584 1.9767346 1.9725971
[133] 1.9572242 1.8859643 1.9810844 2.1300231 1.9238299 2.0252168
[139] 1.9009679 1.9859873 1.9643309 1.9414266 2.0996803 1.9953504
[145] 2.0346951 1.9975663 2.0405251 2.0290622 2.1631461 1.9990700

> myiris4 <- cbind(iris, Stalk.Length)   # appended as the last column
> myiris4
    Sepal.Length Sepal.Width Petal.Length Petal.Width    Species Stalk.Length
1            5.1         3.5          1.4         0.2     setosa    1.1377273
2            4.9         3.0          1.4         0.2     setosa    0.8413762
3            4.7         3.2          1.3         0.2     setosa    1.0871570
4            4.6         3.1          1.5         0.2     setosa    1.1239159
5            5.0         3.6          1.4         0.2     setosa    0.9876178
6            5.4         3.9          1.7         0.4     setosa    1.0413773
7            4.6         3.4          1.4         0.3     setosa    0.9635390
8            5.0         3.4          1.5         0.2     setosa    1.1410253
9            4.4         2.9          1.4         0.2     setosa    1.0697473

> dim(iris)
[1] 150   5
> dim(myiris4)
[1] 150   6
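The five rnorm calls can probably be collapsed into one, since rnorm accepts a vector of means. A sketch under that assumption (group.means, Stalk.Length2 and group are names invented here, and the seed is arbitrary):

```r
set.seed(42)                                    # arbitrary seed (assumption)
group.means <- c(1, 1.3, 1.5, 1.8, 2)           # one mean per block of 30 values
# rnorm is vectorized over mean: value i is drawn with the i-th mean
Stalk.Length2 <- rnorm(150, mean = rep(group.means, each = 30), sd = 0.1)
group <- rep(seq_along(group.means), each = 30) # block label 1..5 for each value
tapply(Stalk.Length2, group, mean)              # per-block means, close to group.means
```

The tapply call is a quick check that each block of 30 really centers on its intended mean.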

=============[p18]==========================================

--------(7)-----------------------------------------------
Add the Stalk.Length column in one step, without cbind

> myiris5 <- iris  # preparation: copy iris
> dim(myiris5)
[1] 150   5
> myiris5$Stalk.Length <- c (rnorm(30,1,0.1),rnorm(30,1.3,0.1),rnorm(30,1.5,0.1),rnorm(30,1.8,0.1),rnorm(30,2,0.1)) 

--------(8)-----------------------------------------------

Check the data after adding the column

> dim(myiris5)   # confirm that there is now a sixth column
[1] 150   6

> myiris5   # display the contents
    Sepal.Length Sepal.Width Petal.Length Petal.Width    Species Stalk.Length
1            5.1         3.5          1.4         0.2     setosa    1.0633843
2            4.9         3.0          1.4         0.2     setosa    1.1781134
3            4.7         3.2          1.3         0.2     setosa    1.0114039
4            4.6         3.1          1.5         0.2     setosa    0.9795390
5            5.0         3.6          1.4         0.2     setosa    0.9872895

> colnames(myiris5)   # check the column names
[1] "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width"  "Species"      "Stalk.Length"

-------(9)------------------------------------------------
In the same way, try rbind, the command for adding rows

> newdat <- data.frame(Sepal.Length=10.1, Sepal.Width=0.5, Petal.Length=2.5, Petal.Width=0.9, Species="myspecies")
> newdat
  Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
1         10.1         0.5          2.5         0.9 myspecies
> myiris6 <- rbind(iris,newdat)  # append the newdat row
> dim(myiris6)
[1] 151   5
> myiris6[151,]     # display row 151 to check it
    Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
151         10.1         0.5          2.5         0.9 myspecies
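Species is a factor in iris, so it is worth checking what rbind does with the unseen value "myspecies". A small sketch (newrow and combined are names chosen for this illustration):

```r
# One-row data frame with the same column names as iris
newrow <- data.frame(Sepal.Length = 10.1, Sepal.Width = 0.5,
                     Petal.Length = 2.5, Petal.Width = 0.9,
                     Species = "myspecies")
combined <- rbind(iris, newrow)   # column names must match those of iris
levels(iris$Species)              # the three original species
levels(combined$Species)          # "myspecies" is added as a fourth level
tail(combined, 2)                 # last original row plus the new row
```

rbind on data frames expands the factor levels as needed, which is why row 151 shows myspecies rather than NA.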

-------(10)----------------------------------------------

Extract the data that match a condition

> mynew.iris1 <- subset(myiris6,Sepal.Length == 10.1)
> mynew.iris2 <- myiris6[myiris6$Sepal.Length == 10.1,]
> mynew.iris1
    Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
151         10.1         0.5          2.5         0.9 myspecies
> mynew.iris2
    Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
151         10.1         0.5          2.5         0.9 myspecies
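The two forms above give the same result here, but they can differ when the tested column contains NA. A minimal sketch on a made-up data frame d (not part of the iris session):

```r
d <- data.frame(x = c(1, NA, 3), y = c("a", "b", "c"))  # toy data with one NA
s1 <- subset(d, x == 3)        # 1 row: subset() treats an NA condition as FALSE
s2 <- d[d$x == 3, ]            # 2 rows: the NA condition produces an all-NA row
s3 <- d[which(d$x == 3), ]     # 1 row: which() keeps only the TRUE positions
nrow(s1); nrow(s2); nrow(s3)
```

iris has no missing values, so this difference does not show up in the session above.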


> mynew.iris3 <- subset(iris, Species == "setosa")

-------(11)------------------------------------------------

Check the extracted contents

> mynew.iris3[1,]
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa


> head(mynew.iris3)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa

Use logical operators in the condition (AND &, OR |, NOT !)
> iris_2species <- subset(iris, (Species == "setosa") | (Species == "versicolor") )
> iris_2species
    Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
1            5.1         3.5          1.4         0.2     setosa
2            4.9         3.0          1.4         0.2     setosa
3            4.7         3.2          1.3         0.2     setosa
4            4.6         3.1          1.5         0.2     setosa
5            5.0         3.6          1.4         0.2     setosa
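Only OR is demonstrated above. A sketch of the other two operators, AND and NOT, on the same data (the object names and the threshold 5 are chosen here for illustration):

```r
# AND: setosa rows whose Sepal.Length is greater than 5
setosa.long <- subset(iris, (Species == "setosa") & (Sepal.Length > 5))
# NOT: every row that is not virginica
not.virginica <- subset(iris, !(Species == "virginica"))
# The OR condition from the notes selects the same rows as the NOT condition
two.species <- subset(iris, (Species == "setosa") | (Species == "versicolor"))
identical(two.species, not.virginica)   # TRUE
nrow(not.virginica)                     # 100
```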



--[p.19 There's more]--------------------------------------------------------------

Extract rows by a vector condition with the %in% operator

> mylength <- c(4:7,7.2)  # create a vector
> mylength
[1] 4.0 5.0 6.0 7.0 7.2

> mynew.iris4 <- iris[iris[,1] %in% mylength,]
> mynew.iris4
    Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
5            5.0         3.6          1.4         0.2     setosa
8            5.0         3.4          1.5         0.2     setosa
26           5.0         3.0          1.6         0.2     setosa
27           5.0         3.4          1.6         0.4     setosa
36           5.0         3.2          1.2         0.2     setosa
41           5.0         3.5          1.3         0.3     setosa
44           5.0         3.5          1.6         0.6     setosa
50           5.0         3.3          1.4         0.2     setosa
51           7.0         3.2          4.7         1.4 versicolor
61           5.0         2.0          3.5         1.0 versicolor
63           6.0         2.2          4.0         1.0 versicolor
79           6.0         2.9          4.5         1.5 versicolor
84           6.0         2.7          5.1         1.6 versicolor
86           6.0         3.4          4.5         1.6 versicolor
94           5.0         2.3          3.3         1.0 versicolor
110          7.2         3.6          6.1         2.5  virginica
120          6.0         2.2          5.0         1.5  virginica
126          7.2         3.2          6.0         1.8  virginica
130          7.2         3.0          5.8         1.6  virginica
139          6.0         3.0          4.8         1.8  virginica
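The %in% test returns one logical value per row, which can be counted or negated before indexing. A short sketch using the same mylength vector (hit and others are names introduced here):

```r
mylength <- c(4:7, 7.2)
hit <- iris$Sepal.Length %in% mylength   # logical vector, one entry per row
sum(hit)                                 # 20, the number of rows listed above
table(iris$Sepal.Length[hit])            # how many rows matched each value
others <- iris[!hit, ]                   # negate to keep the non-matching rows
nrow(others)                             # 130
```

Note that no row has Sepal.Length equal to 4, so that value simply contributes nothing.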

------[p.21]----------------------------------------------------------

Handling missing values (NA)

> a <- c(1:4,NA,6)
> a
[1]  1  2  3  4 NA  6
> mean(a)     # the mean cannot be computed because of the NA
[1] NA
> mean(a, na.rm=TRUE)
[1] 3.2
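Before dropping missing values it helps to locate them. A sketch of the usual checks with is.na, using the same vector a:

```r
a <- c(1:4, NA, 6)
is.na(a)               # TRUE only at position 5
sum(is.na(a))          # 1: number of missing values
mean(a[!is.na(a)])     # 3.2, the same as mean(a, na.rm = TRUE)
sum(a, na.rm = TRUE)   # 16: sum() accepts na.rm as well
```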

---------[p.22]-------------------------------------------------------

> n.data <- rnorm(100,1,0.1)
> hist(n.data)
> plot(density(n.data))
> ?pnorm

> plot(y)
> plot(density(n.data))
> ?qnorm
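The notes stop at the help pages for pnorm and qnorm. As a rough sketch of what the two functions do for the distribution used above (mean 1, sd 0.1; the evaluation points are chosen here for illustration):

```r
p.mean <- pnorm(1, mean = 1, sd = 0.1)             # 0.5: half the mass is below the mean
p.1sd  <- pnorm(1.1, 1, 0.1) - pnorm(0.9, 1, 0.1)  # about 0.683: within one sd of the mean
q.half <- qnorm(0.5, mean = 1, sd = 0.1)           # 1: the value with cumulative probability 0.5
back   <- qnorm(pnorm(1.2, 1, 0.1), 1, 0.1)        # 1.2: qnorm is the inverse of pnorm
c(p.mean, p.1sd, q.half, back)
```

pnorm maps a value to a cumulative probability and qnorm maps a probability back to a value.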