## Graphics of Large Datasets: Visualizing a Million

Springer Science & Business Media, 12 June 2007 - 271 pages

Graphics are great for exploring data, but how can they be used for looking at the large datasets that are commonplace today? This book shows ways of visualizing large datasets, whether large in numbers of cases, large in numbers of variables, or large in both. Data visualization is useful for data cleaning, exploring data, identifying trends and clusters, spotting local patterns, evaluating modeling output, and presenting results. It is essential for exploratory data analysis and data mining. Data analysts, statisticians, computer scientists, indeed anyone who has to explore a large dataset of their own, should benefit from reading this book.

New approaches to graphics are needed to visualize the information in large datasets, and most of the innovations described in this book are developments of standard graphics. There are considerable advantages in extending displays which are well-known and well-tried, both in understanding how best to make use of them in your work and in presenting results to others. This should also make the book readily accessible to readers who already have a little experience of drawing statistical graphics. All ideas are illustrated with displays from analyses of real datasets, and the authors emphasize the importance of interpreting displays effectively. Graphics should be drawn to convey information, and the book includes many insightful examples.

From the reviews:

"Anyone interested in modern techniques for visualizing data will be well rewarded by reading this book. There is a wealth of important plotting types and techniques." Paul Murrell for the Journal of Statistical Software, December 2006

"This fascinating book looks at the question of visualizing large datasets from many different perspectives. Different authors are responsible for different chapters and this approach works well in giving the reader alternative viewpoints of the same problem. Interestingly the authors have cleverly chosen a definition of 'large dataset'. Essentially they focus on datasets with the order of a million cases. As the authors point out there are now many examples of much larger datasets but by limiting to ones that can be loaded in their entirety in standard statistical software they end up with a book that has great utility to the practitioner rather than just the theorist. Another very attractive feature of the book is the many colour plates, showing clearly what can now routinely be seen on the computer screen. The interactive nature of data analysis with large datasets is hard to reproduce in a book but the authors make an excellent attempt to do just this." P. Marriott for the Short Book Reviews of the ISI

### Contents

1.2 Data Visualization

1.3 Research Literature

1.4 How Large Is a Large Dataset?

1.5 The Effects of Largeness

1.5.1 Storage

1.5.2 Quality

1.5.3 Complexity

1.5.5 Analyses

1.5.7 Graphical Formats

1.7 Software

1.8 What Is on the Website

1.8.3 Datasets

1.9 Contributing Authors

Basics

2.2.1 Barcharts and Spineplots for Univariate Categorical Data

2.2.2 Mosaic Plots for Multidimensional Categorical Data

2.3 Plots for Continuous Data

2.3.2 Scatterplots, Parallel Coordinates and the Grand Tour

2.4 Data on Mixed Scales

2.5 Maps

2.6 Contour Plots and Image Maps

2.7 Time Series Plots

2.8 Structure Plots

Scaling Up Graphics (Martin Theus)

3.3 Area Plots

3.3.1 Histograms

3.3.2 Barcharts

3.3.3 Mosaic Plots

3.4 Point Plots

3.4.2 Scatterplots

3.4.3 Parallel Coordinates

3.5 From Areas to Points and Back

3.5.1 α-Blending and Tonal Highlighting

3.6 Modifying Plots

3.7 Summary

Interacting with Graphics (Antony Unwin)

4.2 Interaction

4.3 Interaction and Data Displays

4.3.2 Selection and Linking

4.3.3 Selection Sequences

4.3.4 Varying Plot Characteristics

4.3.5 Interfaces and Interaction

4.3.6 Degrees of Linking

4.3.7 Warnings and Redmarking

4.4 Interaction and Large Datasets

4.4.2 Selection, Linking and Highlighting

4.4.3 Varying Plot Characteristics for Large Datasets: Zooming

4.5 New Interactive Tasks

4.5.2 Aggregation and Recoding

4.5.5 Managing Screen Layout

Applications

5.2.1 Weighted Displays and Weights in Datasets

5.3.1 Sorting and Reordering

5.3.2 Grouping, Averaging and Zooming

5.4 Mosaic Plots

5.4.1 Combinatorics of Mosaic Plots

5.4.2 Cases per Pixel and Pixels per Case

5.4.4 Grayshading

5.4.5 Rescaling Binsizes

5.4.6 Rankings

Rotating Plots (Dianne Cook and Leslie Miller)

6.1.1 Type of Data

6.1.2 Visual Methods for Continuous Variables

6.1.3 Scaling Up Multiple Views for Larger Datasets

6.2.2 Reducing the Number of Cases

6.2.3 Density Estimation

6.2.4 Screen Real Estate Indexing

6.3 Software System

6.4 Application

6.4.3 Scatterplot Matrix

6.5 Current and Future Developments

6.5.2 Software

Multivariate Continuous Data: Parallel Coordinates (Rida Moustafa and Ed Wegman)

7.2 Interpolations and Inner Products

7.3 Generalized Parallel Coordinate Geometry

7.4 A New Family of Smooth Plots

7.5 Examples

Dealing with Massive Datasets

7.6 Detecting Second-Order Structures

7.7 Summary

Networks (Graham Wills)

8.2 Layout Algorithms

8.2.1 Simple Tree Layout

8.2.2 Force Layout Methods

8.2.3 Individual Node Movement Algorithms

8.3.1 Speed Considerations

8.3.2 Interaction and Layout

8.4 NicheWorks

International Calling Fraud

8.6 Languages for Description and Layouts

8.6.2 Graph Specification via VizML

8.7 Summary

Trees (Simon Urbanek)

9.2 Growing Trees for Large Datasets

9.2.1 Scalability of the CART Growing Algorithm

9.2.2 Scalability of Pruning Methods

9.2.3 Statistical Tests and Large Datasets

9.2.4 Using Trees for Large Datasets in Practice

9.3 Visualization of Large Trees

9.3.2 Sectioned Scatterplots

9.3.3 Recursive Plots

9.4 Forests for Large Datasets

9.5 Summary

Transactions (Barbara González-Arévalo, Félix Hernández-Campos, Steve Marron and Cheolwoo Park)

10.2 Mice and Elephant Plots and Random Sampling

10.3 Biased Sampling

10.3.1 Windowed Biased Sampling

10.3.2 Box-Cox Biased Sampling

10.4 Quantile Window Sampling

10.5 Commonality of Flow Rates

Graphics of a Large Dataset (Antony Unwin and Martin Theus)

Data Visualization for Large Datasets

11.3 Visualizing the InfoVis 2005 Contest Dataset

11.3.2 Variables

11.3.4 Multivariate Displays

11.3.5 Grouping and Selection

11.3.6 Special Features

11.3.7 Presenting Results

11.3.8 Summary

### Other editions

Graphics of Large Datasets: Visualizing a Million. Antony Unwin, Martin Theus, Heike Hofmann. Limited preview, 2006

### References to this book

Interactive and Dynamic Graphics for Data Analysis: With R and GGobi. Dianne Cook, Deborah F. Swayne. Limited preview, 2007