10/5/10
Data Classification

-Data Classification
  -a set of data/information is broken into "groups" based on certain criteria
    -example: death rates divided into 7 classes of rate
  -the maker of the map needs to decide on the symbology, classifications, etc. that determine the data classes
  -the divisions between classes can dramatically affect the information and "story" of a map
-Methods for Classifying Data (Classification Schemes)
  -need to understand the distribution of the data to determine which scheme is appropriate
  -there are numerous schemes for classifying data
  -possible to combine multiple schemes based on the goals/story of the map
  -Equal Interval Scheme
    -the interval between class breaks stays the same for every class
    -advantages
      -easy to make
      -easy to read, understand, and remember when reading the map
    -disadvantages
      -ignores the distribution of the data
      -makes things that are similar look different
      -makes things that are different look similar
    -dividing a "cluster" of data in the distribution is not always wrong, but be careful about the impression created by putting similar items into different classes
      -make sure the reason for splitting the cluster is important/appropriate
  -Quantile Scheme
    -goal is to have the same number of data points in each class
    -advantages
      -reveals certain statistical information that may not otherwise be obvious
        -example: the median is clearly defined
    -disadvantages
      -does not account for the distribution of the data set
  -Arithmetic Scheme
    -intervals get wider (or narrower) through the distribution
    -advantages
      -very good for data sets where small differences matter in one part of the data set but not in other parts
        -example: a change in unemployment at the low end (5% to 10%) can be more important than a change at the high end (25% to 30%)
    -disadvantages
      -harder to understand and remember the classes while reading the map
      -can make it difficult to draw clear associations between similar things/data
  -Maximum Breaks Scheme
    -based on clear "breaks" between the classes
    -place breaks at the locations with the largest "gaps" between data values
    -advantages
      -never divides a cluster, because it relies on existing gaps
    -disadvantages
      -just because a cluster is the largest doesn't mean it is the most important data
      -preserving clusters may not be the most important criterion in selecting classes
      -can separate clusters that might be more appropriately kept together
  -Optimal Scheme (Natural Breaks Method)
    -uses a mathematical formula to:
      -minimize differences within a class
      -maximize differences between classes
    -advantages
      -best at showing where clusters are and visualizing the differences between clusters
    -disadvantages
      -may not always show the observations in an easily understood/intuitive way
      -may place breaks in places that are not appropriate/important to the "story" of the map
  -By Eye Scheme
    -"eyeball" where the lines would be most appropriate
    -no computer/formula, just the intuition of the maker
    -advantages
      -allows the maker of the map to decide which areas need to be highlighted and to tell the "story" most appropriately
    -disadvantages
      -inefficient
      -highly subjective
        -multiple people will look at the same data and have different ideas of appropriate breaks
      -subject to bias
-Concerns of Classification
  -may need to develop a classification scheme that fits multiple maps
    -using the same scheme makes it easy to compare changes across maps
    -using the same scheme can lose subtle differences if there are not "enough" classes
  -choices of classification must be defensible
    -do not assume that the classification chosen by the program is the most appropriate
-Choosing the Number of Classes
  -too few classes can cause "lumping" of data and make details difficult to determine
    -lose information
    -appropriate only in limited circumstances
    -difficult to see subtle patterns/changes over the area of the map
  -too many classes make the map difficult to read
    -harder to tell apart the subtle differences between classes in the legend
    -can make it easier to see subtle patterns/changes over the area covered by the map
  -typical map uses 4-6 classes
    -people can retain only 4 or 5 individual items/thoughts in mind at a time
  -"Unclassed" maps
    -remove the need to determine the appropriate "clustering" and/or scheme for the map
    -the legend becomes a gradation scale, with each data value having its own symbol/color that differs slightly from the next
    -can create an extreme number of classes on a map
    -difficult to read and get accurate information from
    -excellent at showing minor gradations/changes on the map
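
To make the difference between schemes concrete, here is a minimal Python sketch of three of them (equal interval, quantile, and maximum breaks). It is not from the lecture: the function names, the convention of reporting each class by its upper limit, and the sample values are all assumptions chosen for illustration, and real mapping software may compute breaks differently.

```python
# Illustrative sketch only: function names and sample data are hypothetical.

def equal_interval_breaks(values, n_classes):
    """Upper class limits with the same width for every class."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_classes
    return [lo + width * i for i in range(1, n_classes + 1)]


def quantile_breaks(values, n_classes):
    """Upper class limits that put roughly the same count in each class."""
    ordered = sorted(values)
    n = len(ordered)
    limits = []
    for i in range(1, n_classes + 1):
        # Rank of the last observation in class i: ceil(i * n / n_classes).
        rank = (i * n + n_classes - 1) // n_classes
        limits.append(ordered[rank - 1])
    return limits


def maximum_breaks(values, n_classes):
    """Upper class limits placed just below the largest gaps in the data."""
    ordered = sorted(values)
    gaps = [(ordered[i + 1] - ordered[i], i) for i in range(len(ordered) - 1)]
    # Keep the (n_classes - 1) widest gaps, then restore ascending order.
    widest = sorted(gaps, reverse=True)[: n_classes - 1]
    cut_positions = sorted(i for _, i in widest)
    return [ordered[i] for i in cut_positions] + [ordered[-1]]


def classify(value, upper_limits):
    """Return the 0-based class index of value given ascending upper limits."""
    for index, limit in enumerate(upper_limits):
        if value <= limit:
            return index
    return len(upper_limits) - 1


# Made-up rates with visible clusters around 0-2, 19-22, and 40-41.
sample_rates = [0, 1, 2, 19, 20, 21, 22, 40, 41, 60]

for name, scheme in [
    ("equal interval", equal_interval_breaks),
    ("quantile", quantile_breaks),
    ("maximum breaks", maximum_breaks),
]:
    limits = scheme(sample_rates, 3)
    counts = [0] * len(limits)
    for rate in sample_rates:
        counts[classify(rate, limits)] += 1
    print(name, limits, counts)
```

With this sample, the equal interval limits fall inside the 19-22 cluster, so the values 20 and 21 land in different classes, while the maximum breaks limits keep that cluster together. The quantile limits give class counts that are as even as ten values in three classes allow.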