Wednesday, February 27, 2008

Thesis Pre

http://itp.nyu.edu/~vb563/Thesis/Thesis_2.pdf

Ranking URLs + frequency


What Determines Presence?







Using Page Rank + Word hit




1. http://new.music.yahoo.com/ has freq 0
2. http://www.music.com/ has freq 3
3. http://www.allmusic.com/ has freq 1
4. http://music.aol.com/ has freq 8
5. http://www.amazon.com/music-rock-classical-pop-jazz/b?ie=UTF8&node=5174 has freq 1
6. http://www.myspace.com/index.cfm?fuseaction=music has freq 0
7. http://www.mtv.com/ has freq 0
8. http://dir.yahoo.com/Entertainment/Music has freq 0
9. http://www.npr.org/music/ has freq 0
10. http://www.billboard.com/ has freq
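
The "has freq" figures above pair each ranked URL with a count of how often the search word shows up on that page. Here is a minimal sketch of that counting step, assuming a case-insensitive whole-word match on page text that has already been downloaded. The names WordFreq, countWord and report are hypothetical, and the sample text is made up for illustration; the actual counting rule used in the prototype may differ.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordFreq {
    // Count case-insensitive whole-word occurrences of keyword in text.
    static int countWord(String text, String keyword) {
        Pattern p = Pattern.compile("\\b" + Pattern.quote(keyword) + "\\b",
                                    Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    // Build one report line in the style of the ranked list.
    static String report(String url, String text, String keyword) {
        return url + " has freq " + countWord(text, keyword);
    }

    public static void main(String[] args) {
        // Made-up page text, for illustration only.
        String sample = "Music news, music videos and more Music.";
        System.out.println(report("http://example.com/", sample, "music"));
    }
}
```

A whole-word match means "musical" does not count as a hit for "music"; a plain substring search would be a looser alternative.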



Updated Timeline:




Week 5:
Work on Filtering Techniques: added word freq to vis
Rough UI build for users to enter keywords: basic keyboard



Week 5:
Work on Filtering Techniques, Rough UI build for users to enter keywords
Status: somewhat done but needs more work.


Week 6:
Building Visualization Applet in Processing


Week 7:
Tying in the applet with the searches.
Detailing the visualization.
Building interaction into the Applet
Addressing Speed Concerns


Week 8:
Tweaking visualization. More detailing.
Addressing Speed Concerns.
Polishing the UI.

Monday, February 25, 2008

Understanding what determines Presence of an idea

Websites

Any half-way decent hosting package will include a basic statistics package. Idealware just did a report on web analytics packages that’s definitely worth a read. If you are pressed for time and only want to track a few elements on a monthly basis, my top five would be:

- Visits - the number of people looking at each page. This tells you the most popular pages on your site.
- Unique visitors - how many different people are visiting your site, regardless of how many times they returned.
- Referrers - where your visitors were before they came to your site. Are they finding you through Google, by typing in your URL directly, or by clicking on a link from someone else’s site?

- Click Path - where people come into your site, where they go, and where they leave. You can also look at top entry and exit pages, but the full click path gives you a better sense for how people typically use your site.
- Keywords - which words people are using in search engines to find your site (and conversely, which words are important to you that aren’t bringing people in).
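
Three of the metrics listed above (visits per page, unique visitors, referrers) are simple counts over a log of page views. Here is a rough sketch, assuming each logged hit is a {visitorId, page, referrer} triple. The class SiteStats, its methods and the sample log are hypothetical and made up for illustration; a real statistics package would read these fields from server logs.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

class SiteStats {
    // Each hit is {visitorId, page, referrer}; these indexes pick one field.
    static final int VISITOR = 0, PAGE = 1, REFERRER = 2;

    // Count hits grouped by one column, e.g. visits per page or hits per referrer.
    static Map<String, Integer> countBy(String[][] hits, int column) {
        Map<String, Integer> counts = new TreeMap<String, Integer>();
        for (String[] hit : hits) {
            Integer old = counts.get(hit[column]);
            counts.put(hit[column], old == null ? 1 : old + 1);
        }
        return counts;
    }

    // Unique visitors: distinct visitor ids, however many pages each one viewed.
    static int uniqueVisitors(String[][] hits) {
        Set<String> ids = new HashSet<String>();
        for (String[] hit : hits) {
            ids.add(hit[VISITOR]);
        }
        return ids.size();
    }

    public static void main(String[] args) {
        // Made-up sample log, for illustration only.
        String[][] hits = {
            {"v1", "/home", "google.com"},
            {"v1", "/about", "google.com"},
            {"v2", "/home", "(direct)"},
        };
        System.out.println("Visits per page: " + countBy(hits, PAGE));
        System.out.println("Unique visitors: " + uniqueVisitors(hits));
        System.out.println("Referrers: " + countBy(hits, REFERRER));
    }
}
```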

Sunday, February 24, 2008

Structuring the Data Streaming



PageRank Explained

At the heart of Google software is a system called PageRank, which basically gives every site on the Internet a rank from 0-10. So how is this calculated? Well, the page rank of your site is determined by the links to your web site. Each time somebody adds a link to your web site, Google interprets this as a vote for your site. The more links you have to your site, the more votes you get.

But Google also looks a little deeper than just sheer volume of links, and analyses the importance of the web site that has cast a vote for your site. Sites that Google determines are important are those with a higher PageRank. So a link to you from a site with a PageRank of 6 is better than a link from a site with a PageRank of 3. In fact, 1 link from a site with a PageRank of 6 is better than 10 links from PageRank 3 sites[1].

Still following? Almost there. When Google is determining how important the link to your site is, it also checks how many other links are on the web page. Take our PageRank 6 page for example. If it has 1000 links on a page, with your site being one of them, Google will determine that the site's 'vote' for your web site is only worth 1/1000 of the PageRank 6 value. If there were only 3 other links on that page their 'vote' for your site will be interpreted by Google as much more important.

http://www.switchit.com/news/improve-pagerank.asp
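
The vote-splitting rule described above can be sketched as a small calculation. This is a simplification that assumes a page's vote is its PageRank score divided evenly among the links on that page, as the quoted article describes. Google's actual formula also uses a damping factor and iterates over the whole link graph. The names VoteShare and voteValue are hypothetical.

```java
class VoteShare {
    // Value of one link's "vote": the linking page's rank split evenly
    // across all links on that page (simplified, no damping factor).
    static double voteValue(double pageRank, int linksOnPage) {
        if (linksOnPage <= 0) {
            throw new IllegalArgumentException("linksOnPage must be positive");
        }
        return pageRank / linksOnPage;
    }

    public static void main(String[] args) {
        // A PageRank 6 page with 1000 links: each vote is 1/1000 of 6.
        System.out.println(voteValue(6, 1000));
        // The same page with your link plus 3 others: each vote is 1/4 of 6.
        System.out.println(voteValue(6, 4));
    }
}
```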

Tuesday, February 19, 2008

Updated Timeline

Timeline

Week 4:
Rough Prototype
using 2-3 keywords, scraping top links, presenting pages in elemental form
Sorting out Interaction Issues


ISSUES:
-Sorting out content from header and meta tags
-Storing a summary of the web page
-Speed, search threads need to be in independent loops.


Resources for summary

http://developer.yahoo.com/search/


Week 5:
Work on Filtering Techniques, Rough UI build for users to enter keywords.


Week 6:
Building Visualization Applet in Processing

Week 7:
Creating a Database of 100 words for which the application works as proof of concept, modifying data filtration for these test cases

Week 8:
Work with visuals and user testing.

Scraping the first 10 lines, up to 3 results with multiple queries



String appid = "YbULhZfV34ERqMXMNQP14Opd68RDsU1n0hhNy_kyqZIKEWGJKYNOa7YgWtfmfvs-";
PFont f;
Bubble[] bubbles = new Bubble[1];

SearchClient client = new SearchClient(appid);

void setup() {
  size(1200, 1200);
  bubbles[0] = new Bubble("", 0, 1000, 1000);
  f = loadFont("Georgia-24.vlw");
  textFont(f);
  smooth();
}

void draw() {
  background(100);
  String searchquery = "processing opengl";
  performsearch(searchquery);
  noLoop();
}

void performsearch(String query) {
  WebSearchRequest request = new WebSearchRequest(query);

  // (Optional) Set the maximum number of results to download
  request.setResults(3);
  try {
    WebSearchResults results = client.webSearch(request);
    // Print out how many hits were found
    println("Displaying " + results.getTotalResultsReturned() +
      " out of " + results.getTotalResultsAvailable() + " hits.");
    println();
    // Get a list of the search results
    WebSearchResult[] resultList = results.listResults();

    // Loop through the results: print each one, draw its bubble, then scrape its page
    for (int i = 0; i < resultList.length; i++) {
      // Print out the document title and URL
      println((i + 1) + ".");
      println(resultList[i].getTitle());
      println(resultList[i].getUrl());
      println();

      // Draw one bubble per URL, once, outside the line-reading loop
      // (all bubbles will be the same size)
      float r = pow(resultList.length, 0.25) * 100.8;
      float spacing = 1.5 * (width / resultList.length);
      float x = bubbles.length * spacing - spacing / 2 + 100 * i;
      int y = 300 + i * 300; // height - i*300 - 200
      Bubble b = new Bubble(resultList[i].getUrl(), r, x, y);
      b.display();
      // bubbles = (Bubble[]) append(bubbles, b);

      // Set the text style before drawing so it also applies to the first line
      fill(102, 102, 153);
      textAlign(LEFT);
      textFont(f, 12);

      // Now read the content from this specific URL
      URL url = new URL(resultList[i].getUrl());
      BufferedReader reader = new BufferedReader(new InputStreamReader(url.openStream()));
      String line;
      int printcount = 0;
      // Draw and print the first 10 lines scraped from the page
      while ((line = reader.readLine()) != null && printcount < 10) {
        text(line, x, y + printcount * 10);
        println(line);
        printcount++;
      }
      // Close the reader
      reader.close();
    }
  }
  // Error handling below, see the documentation of the Yahoo! API for details
  catch (IOException e) {
    println("Error calling Yahoo! Search Service: " + e.toString());
    e.printStackTrace();
  }
  catch (SearchException e) {
    println("Error calling Yahoo! Search Service: " + e.toString());
    e.printStackTrace();
  }
}

Adding URLs for scraping

Algorithm for scraping data from Links obtained


GET KEYWORDS

SEARCH

BRING BACK TOP 10 LINKS

APPEND LINKS TO LINKED LIST

FOR EACH LINK BRING BACK THE SCRAPE,
APPEND THE URL,
DISPLAY

STOP
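
The steps above can be sketched in plain Java. This is only an outline under stated assumptions: the search and scrape steps are passed in as functions so the flow can run without a network connection, and the names ScrapePipeline, run and MAX_LINKS are hypothetical. In the prototype, the search function would wrap the Yahoo search call and the scrape function would read lines from the page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.function.Function;

class ScrapePipeline {
    static final int MAX_LINKS = 10;

    // search: keywords -> ranked list of URLs (stand-in for the search API call)
    // scrape: URL -> text pulled from that page (stand-in for the page download)
    static List<String> run(String keywords,
                            Function<String, List<String>> search,
                            Function<String, String> scrape) {
        // SEARCH, BRING BACK TOP 10 LINKS, APPEND LINKS TO LINKED LIST
        LinkedList<String> links = new LinkedList<String>();
        for (String url : search.apply(keywords)) {
            if (links.size() == MAX_LINKS) {
                break;
            }
            links.add(url);
        }
        // FOR EACH LINK BRING BACK THE SCRAPE, APPEND THE URL
        List<String> entries = new ArrayList<String>();
        for (String url : links) {
            entries.add(scrape.apply(url) + " [" + url + "]");
        }
        return entries; // DISPLAY is left to the caller
    }

    public static void main(String[] args) {
        // Stub search and scrape functions with made-up URLs.
        List<String> entries = run("processing opengl",
            q -> Arrays.asList("http://a.example/", "http://b.example/"),
            url -> "first lines of " + url);
        for (String entry : entries) {
            System.out.println(entry);
        }
    }
}
```

Keeping the network calls behind two function parameters also makes it easier to move them into separate threads later, which the speed issue noted in the timeline would call for.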

Friday, February 15, 2008

Scraping first 10 urls

// Replace this with a developer key from http://developer.yahoo.com
String appid = "YbULhZfV34ERqMXMNQP14Opd68RDsU1n0hhNy_kyqZIKEWGJKYNOa7YgWtfmfvs-";

SearchClient client = new SearchClient(appid);
String query = "processing.org";
WebSearchRequest request = new WebSearchRequest(query);

// (Optional) Set the maximum number of results to download
//request.setResults(30);

try {
  WebSearchResults results = client.webSearch(request);
  // Print out how many hits were found
  println("Displaying " + results.getTotalResultsReturned() +
    " out of " + results.getTotalResultsAvailable() + " hits.");
  println();
  // Get a list of the search results
  WebSearchResult[] resultList = results.listResults();
  // Loop through the results and print them to the console

  for (int i = 0; i < resultList.length; i++) {
    // Print out the document title and URL.
    println((i + 1) + ".");
    println(resultList[i].getTitle());
    println(resultList[i].getUrl());
    println();
  }

  // Error handling below, see the documentation of the Yahoo! API for details
}
catch (IOException e) {
  println("Error calling Yahoo! Search Service: " + e.toString());
  e.printStackTrace();
}
catch (SearchException e) {
  println("Error calling Yahoo! Search Service: " + e.toString());
  e.printStackTrace();
}



Here are the results:





Displaying 10 out of 10500 hits.

1.
Processing 1.0 (BETA)
http://processing.org/

2.
Mobile Processing
http://mobile.processing.org/

3.
Download \ Processing 1.0 (BETA)
http://processing.org/download/index.html

4.
Learning \ Processing 1.0 (BETA)
http://processing.org/learning/index.html

5.
Processing 1.0 (BETA) >> Examples
http://dev.processing.org/

6.
hardware.processing.org
http://hardware.processing.org/

7.
width \ Language (API) \ Processing 1.0 (BETA)
http://processing.org/reference/width.html

8.
Environment (IDE) \ Processing 1.0 (BETA)
http://processing.org/reference/environment/index.html

9.
Video \ Libraries \ Processing 1.0 (BETA) \ Processing 1.0 (BETA)
http://processing.org/reference/libraries/video/index.html

10.
int \ Language (API) \ Processing 1.0 (BETA)
http://processing.org/reference/int.html

This is created using Yahoo's Search API.
The yahoo_search-2.0.1 jar file needs to be dragged into the sketch folder along with the code so the search class can be found.

Wednesday, February 13, 2008

FINDCLOUD

Project Context:

Extracting useful data from a visualization, as a means of departing from visualizations of data that are meaningful to other people. Computing methods have evolved to handle large amounts of data, but there is a lack of focus on representing that data in ways that are visually interesting while being useful and inviting interaction.
One type of data visualization may be to understand a complex data set.
But if we added a parameter of evolution to a data set, can this information visualization be a tool for exploring the original idea or building further on it?
It's interesting that users determine the trend of evolution by modifying the search criteria and the categories: are they looking into blogs or news?
Cloud is an application to look for what's happening in relation to new ideas, or what exists out there, when you are thinking of prototyping a new concept.
The keywords used in this search could be completely random sets.
The search results are graphically mapped into clouds which represent the following parameters:
- A sense of conveying top hits if this data were to be in list form
- Popularity of the cloud
- Relevance to your search


Find Cloud Mind Cloud is a data visualization/idea exploration tool.
It attempts to visualize activity based on keyword sets that users enter.

Typically a user would search for existing activity on, e.g., gestures + instruments using a text-based search engine.
The user's order of exploring the data relies on the order in which the search engine's algorithm arranges the links. In some ways search engine results are equivalent to one-page results.
In such cases users modify keywords, and it's hard to save searches and their contexts unless you navigate away from the current page to start a new search.
Find Cloud visualizes successive search results in the form of layers.
The application attempts to aggregate data sets by eliminating the process of:
- hitting a link
- browsing away (tab/another page)
- revisiting the links list
- re-entering search keywords
- examining new links
- re-assessing search criteria based on whether articles meet interests.
While creating different find clouds from user-generated keywords, it pulls random words from the existing links to create word associations that help users think of the searched data in different contexts.

Think Cloud is the cloud that builds as users search using more keywords.
It constructs random word associations with the entered keywords (a, b, c).
The attributes of these random words can be applied to a, b, c to associate new ideas with (abc).

Both versions of the application are collapsible, and users can use only the visualization or both.


Inspiration, Experience and Contextual Research:

I became interested in what visualizations mean to users and how they can be personal and useful as thinking tools. Apart from gleaning useful conclusions, how can they serve to be explorative?

While many visualizations in categories such as Art, Biology, Business Networks, Internet, Knowledge Networks, Multi-Domain Representation, and Music were visually pleasing, it felt like there was another learning curve to understand the visual presentation of the information. It also felt like very passive information presentation. What did this information mean to me? Could I control the variables being visualized?
A list form of data searching is what comes up in search engines.
Users move between links in the order they are presented, looking for relevant information.
Is there a way a visualization can be an exploration tool to eliminate the layers of searching for contexts that are new and evolving, where information about the level of activity and user-generated content on the subject is valuable?
While most visualizations simply assimilate large, complex data sets, Cloud attempts to add another layer to visualization, one that allows you to explore.

Open Source Programs that work for representing and managing complex computations
http://www.vistrails.org/index.php/Main_Page

Visual depiction techniques/methods to show conceptual uniqueness and originality in the choice of a subject.
http://www.visualcomplexity.com/vc/

Some interesting work in visualizations (Newsmaps, Fidgt, etc.)
http://64.233.183.104/search?q=cache:E5VWG0vJsyYJ:www.smashingmagazine.com/2007/08/02/data-visualization-modern-approaches/+http://www.smashingmagazine.com/2007/08/02/data-visualization-modern-approaches/&hl=en&ct=clnk&cd=1&client=firefox-a

Resource List Wikipedia
http://en.wikipedia.org/wiki/Visualization_(graphic)#Knowledge_visualization

MIT Aesthetics and Computation Group
http://acg.media.mit.edu/

Ben Fry:Genomic Cartography
http://acg.media.mit.edu/people/fry/

Similar Apps: visual exploration of the web
www.walk2web.com

Timeline

Week 4:
Rough Prototype
using 2-3 keywords, scraping top links, presenting pages in elemental form
Sorting out Interaction Issues

Week 5:
Work on Filtering Techniques, Rough UI build for users to enter keywords.


Week 6:
Building Visualization Applet in Processing

Week 7:
Creating a Database of 100 words for which the application works as proof of concept, modifying data filtration for these test cases

Week 8:
Work with visuals and user testing.







Traditional Methods:


What's the nature of Search



Typical Process during an unusual search




Clouds/other graphic entities representing links



Browsing one cloud/link at a time with no departure from the main application page.
Original links are present if a user wishes to visit the actual website.




Random word associations as you browse articles/feeds




Visual feedback for links that have been visited, moving them to a different spatial location on the screen, regathering the stack in the same order when links have been viewed and closed




Methods to convey order in which data was ranked/arranged as in the case of a regular search engine.



Thesis Document Outline

I. Process Documentation
1. Personal Statement

2. Context, Background Research, Inspiration

3. Method:

Prototype Design Treatment

Form, Structure

Content

User Scenarios

User Experience

Mechanics

Design Phase

Implementation of Prototype:

Development Schedule

Resources

Description of Development Process

Design Considerations and Development Issues

4. Research:

Formative Design Research/User Testing
Description of Testing, Goals

Description of Process

Sticky Points


II. Publication


1. Title Page

2. Abstract

3. Introduction

Concept Overview

Concept Sketches

Context

Goals

Audience, Location, Interaction Time

Core Features and Functionality


5. Summary. Conclusions.

6. Bibliography

7. Pointers to web-hosted material (appendices, additional bibliographic references, etc.)