
What do we really know about real estate in Orange County?

Enrique Gonzalez
Maria Daniela Vetencourt
March 5th, 2014

Introduction

In this project we analyze Orange County in depth. The analysis should help anybody who would like to invest in or live in a property in Orange County, Florida. We explain the important variables to take into consideration when buying a property and answer several common questions statistically. The variables we analyze are:

UC (Use code): 1 = single-family homes, 4 = condos.
Sales Price: The price at which the property was sold.
Sales Year: The year the property was sold.
Sales Month: The month the property was sold, ranging from 1 (January 2011) to 25 (January 2013).
Effective Year Built: The year the property was built.
Total Living Area: The size of the property.

Using these variables we were able to answer the common questions mentioned above. Among these questions are:

What is the trend of the sale price depending on the year the property was sold?
In what month were properties the most expensive?
How does the year the property was built affect the price?
What type of housing was bought most often between 2011 and 2012?
How does the size of the property affect the price?
What is the trend in the size of the properties depending on the year they were built?

UNIVARIATE ANALYSIS

Descriptive Statistics:

UC (Nominal)

UC      Count    Percentage
1       14069    89.74%
4       1609     10.26%
TOTAL   15678    100%
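The percentages in the table above can be reproduced from the counts alone. The following is a minimal sketch in Python; the variable names (uc_counts, uc_percentages) are chosen here for illustration and do not come from the original analysis.

```python
# Counts taken from the UC table: UC 1 = single-family homes, UC 4 = condos.
uc_counts = {1: 14069, 4: 1609}

# Total number of properties in the data set.
total = sum(uc_counts.values())

# Share of each use code as a percentage, rounded to two decimals.
uc_percentages = {uc: round(100 * n / total, 2) for uc, n in uc_counts.items()}

print(total)
print(uc_percentages)
```

Rounding to two decimals matches the precision reported in the table.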

The graph above shows the distribution of the UC data. The red area represents condos and the blue area represents single-family residences. The blue area is larger than the red area, which shows that there are more single-family residences than condos: 14069 single-family residences (89.74% of the total) against 1609 condos (10.26% of the total). This distribution will be important for later study.

Sales Price (Discrete)

Variable  Mean    St.Dev  Min     Q1      Median  Q3      Max      Range   IQR     Skewness  Kurtosis
SALE_PRC  227522  139238  100000  138000  185000  262000  1000000  900000  124000  2.36      6.70
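The Range and IQR columns follow directly from the other values in the table, and the same values show where the sale prices are most concentrated. The sketch below, in Python, recomputes them from the reported summary; the dictionary and variable names are illustrative assumptions, not part of the original analysis.

```python
# Summary values copied from the SALE_PRC row of the table.
sale_prc = {
    "mean": 227522,
    "min": 100000,
    "q1": 138000,
    "median": 185000,
    "q3": 262000,
    "max": 1000000,
}

# Range = Max - Min; IQR = Q3 - Q1.
value_range = sale_prc["max"] - sale_prc["min"]
iqr = sale_prc["q3"] - sale_prc["q1"]

# Each quartile interval holds a quarter of the sales, so a narrower
# interval means the prices in it are packed more densely.
lower_half_width = sale_prc["median"] - sale_prc["q1"]
upper_half_width = sale_prc["q3"] - sale_prc["median"]

# A mean above the median is consistent with positive (right) skew.
mean_above_median = sale_prc["mean"] > sale_prc["median"]

print(value_range, iqr, lower_half_width, upper_half_width, mean_above_median)
```

The interval from Q1 to the median is narrower than the interval from the median to Q3, so the same share of sales falls into a smaller price band below the median.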

The boxplot is a good way to show how the data are distributed between the quartiles: half of the sale prices lie above the median (185000) and half below it. The kurtosis of 6.70 reflects the peakedness of the distribution and the weight of its tails. In the histogram above the data have a positive skewness of 2.36. In positively skewed data the median is smaller than the mean (185000 < 227522). The data are more densely concentrated between the first quartile (Q1 = 138000) and the median (185000) than between the median and the third quartile (Q3 = 262000): each interval contains a quarter of the sales, but the first spans 47000 while the second spans 77000. The highest point of the histogram is at approximately 150000, which lies between the first quartile and the median and shows that these values are the most frequent.

Sales Year (Ordinal)

Sales Year  Count  Percentage
2011        4577   ...

