CorrelationCoefficients

Heather Byrne
December 10, 2009

Overview

This program computes the correlation coefficient for every possible pair of attributes in a data set. It reads a CSV file, performs the calculations, and prints the results to the screen with no pair repeated (in other words, if AB is on the list, then BA is not).
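The pipeline described above can be sketched in a few lines. The project itself is written in Scheme; the sketch below uses Python purely for illustration and makes several assumptions that the page does not confirm: that the coefficient is the Pearson correlation coefficient, that the CSV file has a header row of attribute names, and that every column is numeric. The names `pearson` and `correlate_all` are hypothetical.

```python
import csv
import io
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences.

    Assumes neither sequence is constant (a constant column would make the
    denominator zero).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlate_all(csv_text):
    """Return {(name_a, name_b): r} for every unordered pair of columns."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    # Transpose the rows into columns of floats, keyed by attribute name.
    columns = {name: [float(row[i]) for row in data]
               for i, name in enumerate(header)}
    # combinations() yields (A, B) but never (B, A), so no pair is repeated.
    return {(a, b): pearson(columns[a], columns[b])
            for a, b in combinations(header, 2)}


# Made-up sample data with three attributes.
sample = "A,B,C\n1,2,6\n2,4,5\n3,6,1\n"
for pair, r in correlate_all(sample).items():
    print(pair, round(r, 3))
```

With three attributes the result has three entries (AB, AC, BC); in general, n attributes give n(n-1)/2 pairs.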

Screenshot

Concepts Demonstrated

  • Linear recursion is used to build the list of all possible pairs of attributes (in fact, it is used in almost every function)
  • Compound procedures are used to calculate the correlation coefficient
  • Procedures as black-box abstractions are used to calculate the parts of the correlation coefficient
  • Abstraction with higher-order procedures is used for some parts of the correlation coefficient calculation
  • Data abstraction is used to represent and manipulate columns
  • Pairs and lists are used for hierarchical data representation
  • Mapping (a sequence operation) is used to manipulate some of the data
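As a rough illustration of the first and last points, the sketch below builds the list of unordered pairs with linear recursion and a map. It is written in Python rather than the project's Scheme, and the name `all_pairs` is hypothetical; it is not the project's actual procedure. The split into `first` and `rest` plays the role that `car` and `cdr` would play in Scheme.

```python
def all_pairs(items):
    """Recursively list every unordered pair drawn from items."""
    if len(items) < 2:
        return []  # base case: fewer than two items means no pairs
    first, rest = items[0], items[1:]
    # Pair the first item with each later item (a map over the rest),
    # then recurse on the rest so earlier items are never revisited.
    return list(map(lambda other: (first, other), rest)) + all_pairs(rest)


print(all_pairs(["A", "B", "C", "D"]))
# [('A', 'B'), ('A', 'C'), ('A', 'D'), ('B', 'C'), ('B', 'D'), ('C', 'D')]
```

Because each call only pairs the first item with the items that come after it, a pair such as BA can never appear once AB has been produced.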

External Technology

This project uses csv.plt, a package that provides comma-separated value utilities in Scheme. I used it to read in and parse the file. Unfortunately, its functions for parsing a file weren't sufficient for my needs, and I spent most of my time writing procedures to interpret the file in a more useful way.

Innovation

This project could possibly be used as an extension to the csv.plt library. It also provides some of the legwork for some planned future work on automatic labeling and intelligent visualization.

Technology Used Block Diagram

Additional Remarks

I ran into several roadblocks along the way. Two worth mentioning are problems with the built-in "next-row" and an incorrect formula.

When I was trying to access different parts of the data, I used "next-row", which advances to the next row each time it is called. The problem is that it can iterate through the data only once and is then left pointing at an empty list. I solved this by using other procedures to access a row.
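The same one-pass behaviour shows up in many CSV readers. The sketch below is an analogy using Python's standard `csv.reader`, not the csv.plt API: a second pass over an exhausted reader yields nothing, and one common workaround (not necessarily the one used in this project) is to read the rows into a list once and index that list afterwards.

```python
import csv
import io

text = "A,B\n1,2\n3,4\n"

reader = csv.reader(io.StringIO(text))
first_pass = [row for row in reader]
second_pass = [row for row in reader]  # the reader is already exhausted
print(len(first_pass), len(second_pass))  # 3 0

# Workaround: read everything once, then access any row as often as needed.
rows = list(csv.reader(io.StringIO(text)))
print(rows[1], rows[1])
```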

I also spent hours trying to figure out why my results were incorrect, only to have someone point out that the formula on the website I was using had its parentheses in the wrong place. Once I found that out, I was finally able to complete the project.
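This page does not say which formula was used or where the parentheses were misplaced. For reference only, and assuming the Pearson correlation coefficient, a common computational form for n paired observations is:

```latex
r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}
         {\sqrt{\left[\,n\sum x^{2} - \left(\sum x\right)^{2}\right]
                \left[\,n\sum y^{2} - \left(\sum y\right)^{2}\right]}}
```

Grouping matters here: the sum of squares, \sum x^{2}, is not the same quantity as the square of the sum, (\sum x)^{2}, so a single misplaced parenthesis gives a different result.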

Page last modified on December 10, 2009, at 06:51 AM