Wednesday, February 27, 2008
Ranking URLs + frequency

What Determines Presence?

Using Page Rank + Word hit

1.
http://new.music.yahoo.com/ has freq 0
2.
http://www.music.com/ has freq 3
3.
http://www.allmusic.com/ has freq 1
4.
http://music.aol.com/ has freq 8
5.
http://www.amazon.com/music-rock-classical-pop-jazz/b?ie=UTF8&node=5174 has freq 1
6.
http://www.myspace.com/index.cfm?fuseaction=music has freq 0
7.
http://www.mtv.com/ has freq 0
8.
http://dir.yahoo.com/Entertainment/Music has freq 0
9.
http://www.npr.org/music/ has freq 0
10.
http://www.billboard.com/ has freq
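
The "has freq" figures above pair each ranked URL with a keyword hit count. A minimal sketch of how such a count might be computed once a page's text has been downloaded. The class and method names (`WordFrequency`, `countHits`) are hypothetical, and whole-word, case-insensitive matching is an assumption rather than the project's actual rule:

```java
import java.util.Locale;

class WordFrequency {
    // Counts case-insensitive, whole-word occurrences of keyword in text.
    // (Hypothetical helper; the matching rule is an assumption.)
    static int countHits(String text, String keyword) {
        if (text == null || keyword == null || keyword.isEmpty()) {
            return 0;
        }
        String needle = keyword.toLowerCase(Locale.ROOT);
        int hits = 0;
        // Split on anything that is not a letter or a digit.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (token.equals(needle)) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String page = "Music news, music videos and more. MUSIC!";
        System.out.println("has freq " + countHits(page, "music"));
    }
}
```

Whole-word matching means "musical" does not count as a hit for "music"; a substring match would give higher counts.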

Updated Timeline:
Week 5:
Work on Filtering Techniques: added word freq to vis
Rough UI build for users to enter keywords: basic keyboard
Week 5:
Work on Filtering Techniques, Rough UI build for users to enter keywords
Status: somewhat done but needs more work.
Week 6:
Building Visualization Applet in Processing
Week 7:
Tying in the applet with the searches.
Detailing the visualization.
Building interaction into the Applet
Addressing Speed Concerns
Week 8:
Tweaking visualization. More detailing.
Addressing Speed Concerns.
Polishing the UI.
Monday, February 25, 2008
Understanding what determines Presence of an idea
Websites
Any half-way decent hosting package will include a basic statistics package. Idealware just did a report on web analytic packages that’s definitely worth a read. If you are pressed for time and only want to track a few elements on a monthly basis, my top five would be
- Visits - the number of people looking at each page. This tells you the most popular pages on your site.
- Unique visitors - how many different people are visiting your site, regardless of how many times they returned.
- Referrers - where your visitors were before they came to your site. Are they finding you through Google, by typing in your URL directly, or by clicking on a link from someone else’s site?
- Click Path - where people come into your site, where they go, and where they leave. You can also look at top entry and exit pages, but the full click path gives you a better sense for how people typically use your site.
- Keywords - which words people are using in search engines to find your site (and conversely, which words are important to you that aren’t bringing people in).
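
To make the difference between the first two metrics concrete, here is a small sketch that computes visits per page and unique visitors from a list of page views. The `SiteStats` and `PageView` names and the record layout are hypothetical; a real statistics package would read these from server logs:

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

class SiteStats {
    // One page view: which visitor looked at which page (hypothetical record layout).
    static final class PageView {
        final String visitorId;
        final String page;

        PageView(String visitorId, String page) {
            this.visitorId = visitorId;
            this.page = page;
        }
    }

    // Visits: how many times each page was viewed.
    static Map<String, Integer> visitsPerPage(List<PageView> views) {
        Map<String, Integer> counts = new TreeMap<>();
        for (PageView view : views) {
            counts.merge(view.page, 1, Integer::sum);
        }
        return counts;
    }

    // Unique visitors: distinct visitor ids, however many pages each one viewed.
    static int uniqueVisitors(List<PageView> views) {
        Set<String> ids = new HashSet<>();
        for (PageView view : views) {
            ids.add(view.visitorId);
        }
        return ids.size();
    }
}
```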
Sunday, February 24, 2008
Structuring the Data Streaming

PageRank Explained
At the heart of Google software is a system called PageRank, which basically gives every site on the Internet a rank from 0-10. So how is this calculated? Well, the page rank of your site is determined by the links to your web site. Each time somebody adds a link to your web site, Google interprets this as a vote for your site. The more links you have to your site, the more votes you get.
But Google also looks a little deeper than just sheer volume of links, and analyses the importance of the web site that has cast a vote for your site. Sites that Google determines are important are those with a higher PageRank. So a link to you from a site with a PageRank of 6 is better than a link from a site with a PageRank of 3. In fact, 1 link from a site with a PageRank of 6 is better than 10 links from PageRank 3 sites[1].
Still following? Almost there. When Google is determining how important the link to your site is, it also checks how many other links are on the web page. Take our PageRank 6 page for example. If it has 1000 links on a page, with your site being one of them, Google will determine that the site's 'vote' for your web site is only worth 1/1000 of the PageRank 6 value. If there were only 3 other links on that page their 'vote' for your site will be interpreted by Google as much more important.
http://www.switchit.com/news/improve-pagerank.asp
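
The "votes" idea can be sketched with the commonly published simplified PageRank iteration, where each page splits its rank evenly among the pages it links to. This is only an illustration: the damping factor of 0.85, the fixed iteration count, and the class name `SimplePageRank` are assumptions, the scores are fractions that sum to 1 rather than the 0-10 scale described above, and Google's production ranking is not public.

```java
import java.util.Arrays;

class SimplePageRank {
    // links[i] lists the pages that page i links to (its outgoing "votes").
    static double[] rank(int[][] links, double damping, int iterations) {
        int n = links.length;
        double[] pr = new double[n];
        Arrays.fill(pr, 1.0 / n);
        for (int it = 0; it < iterations; it++) {
            double[] next = new double[n];
            Arrays.fill(next, (1.0 - damping) / n);
            for (int page = 0; page < n; page++) {
                if (links[page].length == 0) {
                    // A page with no outgoing links spreads its rank over all pages.
                    for (int j = 0; j < n; j++) {
                        next[j] += damping * pr[page] / n;
                    }
                    continue;
                }
                // Each vote is diluted by the number of links on the voting page.
                double share = damping * pr[page] / links[page].length;
                for (int target : links[page]) {
                    next[target] += share;
                }
            }
            pr = next;
        }
        return pr;
    }

    public static void main(String[] args) {
        // Pages 0 and 1 both link to page 2; page 2 links back to page 0.
        int[][] links = { {2}, {2}, {0} };
        System.out.println(Arrays.toString(rank(links, 0.85, 50)));
    }
}
```

In this three-page example, page 2 ends up with the highest score because it receives two votes, and page 1 the lowest because nothing links to it.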
Tuesday, February 19, 2008
Updated Timeline
Timeline
Week 4:
Rough Prototype
using 2-3 keywords, scraping top links, presenting pages in elemental form
Sorting out Interaction Issues
ISSUES:
-Sorting out content from header and meta tags
-Storing a summary of the web page
-Speed, search threads need to be in independent loops.
Resources for summary
http://developer.yahoo.com/search/
Week 5:
Work on Filtering Techniques, Rough UI build for users to enter keywords.
Week 6:
Building Visualization Applet in Processing
Week 7:
Creating a Database of 100 words for which the application works as proof of concept, modifying data filtration for these test cases
Week 8:
Work with visuals and user testing.
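
One way to approach the speed issue listed under Week 4 (search threads in independent loops) is to run each query on its own worker thread and collect the results afterwards. The sketch below is an assumption about how that could look, not the project's code: `ParallelSearch` and `searchAll` are hypothetical names, and the actual search call is passed in as a function so the example runs without any network access.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

class ParallelSearch {
    // Runs `search` once per query, each on its own worker thread,
    // and returns the results keyed by query in the original order.
    static Map<String, List<String>> searchAll(List<String> queries,
                                               Function<String, List<String>> search) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, queries.size()));
        try {
            Map<String, Future<List<String>>> pending = new LinkedHashMap<>();
            for (String query : queries) {
                Callable<List<String>> task = () -> search.apply(query);
                pending.put(query, pool.submit(task));
            }
            Map<String, List<String>> results = new LinkedHashMap<>();
            for (Map.Entry<String, Future<List<String>>> entry : pending.entrySet()) {
                // get() blocks until that particular search has finished.
                results.put(entry.getKey(), entry.getValue().get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Search interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Search failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

A slow query then delays only its own result instead of holding up every search that comes after it.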
Scraping the first 10 lines, up to 3 results with multiple queries


// Requires the yahoo_search-2.0.1 jar in the sketch folder.
import com.yahoo.search.*;
import java.io.*;
import java.net.URL;

// Replace this with a developer key from http://developer.yahoo.com
String appid = "YbULhZfV34ERqMXMNQP14Opd68RDsU1n0hhNy_kyqZIKEWGJKYNOa7YgWtfmfvs-";
PFont f;
// Bubble is a custom class defined elsewhere in the sketch (not shown here).
Bubble[] bubbles = new Bubble[1];
SearchClient client = new SearchClient(appid);

void setup() {
  size(1200, 1200);
  bubbles[0] = new Bubble("", 0, 1000, 1000);
  f = loadFont("Georgia-24.vlw");
  textFont(f);
  smooth();
}

void draw() {
  background(100);
  String searchquery = "processing opengl";
  performsearch(searchquery);
  noLoop(); // Search and draw only once
}

void performsearch(String query) {
  WebSearchRequest request = new WebSearchRequest(query);
  // (Optional) Set the maximum number of results to download
  request.setResults(3);
  try {
    WebSearchResults results = client.webSearch(request);
    // Print out how many hits were found
    println("Displaying " + results.getTotalResultsReturned() +
      " out of " + results.getTotalResultsAvailable() + " hits.");
    println();
    // Get a list of the search results
    WebSearchResult[] resultList = results.listResults();
    // Loop through the results and print them to the console
    for (int i = 0; i < resultList.length; i++) {
      // Print out the document title and URL.
      println((i + 1) + ".");
      println(resultList[i].getTitle());
      println(resultList[i].getUrl());
      println();
      // Draw one bubble for this URL; all bubbles are the same size
      float r = pow(resultList.length, 0.25) * 100.8;
      float spacing = 1.5 * (width / resultList.length);
      float x = bubbles.length * spacing - spacing / 2 + 100 * i;
      int y = 300 + i * 300;
      Bubble b = new Bubble(resultList[i].getUrl(), r, x, y);
      b.display();
      // bubbles = (Bubble[]) append(bubbles, b);
      // Set the text style before drawing any scraped lines
      fill(102, 102, 153);
      textAlign(LEFT);
      textFont(f, 12);
      // Now read the content from this specific URL
      URL url = new URL(resultList[i].getUrl());
      BufferedReader reader = new BufferedReader(new InputStreamReader(url.openStream()));
      String line;
      int printcount = 0;
      // Draw and print the first 10 lines scraped from the page
      while (printcount < 10 && (line = reader.readLine()) != null) {
        text(line, x, y + printcount * 10);
        println(line);
        printcount++;
      }
      // Close the reader
      reader.close();
    }
  }
  // Error handling below, see the documentation of the Yahoo! API for details
  catch (IOException e) {
    println("Error calling Yahoo! Search Service: " + e.toString());
    e.printStackTrace();
  }
  catch (SearchException e) {
    println("Error calling Yahoo! Search Service: " + e.toString());
    e.printStackTrace();
  }
}
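
The lines scraped this way are raw HTML, which ties into the issue of sorting out content from header and meta tags. A rough sketch of one possible filter, assuming the whole page has been read into a single string. The `HtmlText` class is hypothetical, and regular expressions are only a quick approximation here; a proper HTML parser would be more robust on malformed pages:

```java
class HtmlText {
    // Returns an approximation of the visible text of an HTML page.
    static String visibleText(String html) {
        String s = html;
        // Drop the whole <head> section (title, meta tags and so on).
        s = s.replaceAll("(?is)<head\\b.*?</head>", " ");
        // Drop scripts and style sheets together with their contents.
        s = s.replaceAll("(?is)<(script|style)\\b.*?</\\1>", " ");
        // Replace every remaining tag with a space.
        s = s.replaceAll("(?s)<[^>]*>", " ");
        // Collapse runs of whitespace.
        return s.replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        String page = "<html><head><title>T</title></head>"
                + "<body><p>Hello <b>world</b></p></body></html>";
        System.out.println(visibleText(page));
    }
}
```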
Adding URLs for scraping
Algorithm for scraping data from Links obtained
GET KEYWORDS
SEARCH
BRING BACK TOP 10 LINKS
APPEND LINKS TO LINKED LIST
FOR EACH LINK BRING BACK THE SCRAPE,
APPEND THE URL,
DISPLAY
STOP
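
The steps above can be sketched as a small pipeline. Everything named here (`ScrapePipeline`, `run`, the `search` and `fetchLines` functions) is hypothetical; the search and the page download are passed in as functions so that the outline runs offline with stub data instead of the Yahoo API:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class ScrapePipeline {
    // search: keywords -> ranked URLs; fetchLines: URL -> lines of that page.
    static Map<String, List<String>> run(String keywords,
                                         Function<String, List<String>> search,
                                         Function<String, List<String>> fetchLines,
                                         int maxLinks, int maxLines) {
        // SEARCH, BRING BACK TOP LINKS, APPEND LINKS TO LINKED LIST
        LinkedList<String> links = new LinkedList<>();
        for (String url : search.apply(keywords)) {
            if (links.size() == maxLinks) {
                break;
            }
            links.add(url);
        }
        // FOR EACH LINK BRING BACK THE SCRAPE, keyed by its URL
        Map<String, List<String>> scrapes = new LinkedHashMap<>();
        for (String url : links) {
            List<String> lines = fetchLines.apply(url);
            int keep = Math.min(maxLines, lines.size());
            scrapes.put(url, new ArrayList<>(lines.subList(0, keep)));
        }
        return scrapes;
    }

    public static void main(String[] args) {
        // Stub search and fetch functions stand in for the real network calls.
        Map<String, List<String>> out = run("processing opengl",
                q -> Arrays.asList("http://example.com/a", "http://example.com/b"),
                url -> Arrays.asList("first line of " + url, "second line"),
                10, 1);
        // DISPLAY
        for (Map.Entry<String, List<String>> e : out.entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```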
Friday, February 15, 2008
Scraping first 10 urls
import com.yahoo.search.*;
import java.io.IOException;

// Replace this with a developer key from http://developer.yahoo.com
String appid = "YbULhZfV34ERqMXMNQP14Opd68RDsU1n0hhNy_kyqZIKEWGJKYNOa7YgWtfmfvs-";
SearchClient client = new SearchClient(appid);
String query = "processing.org";
WebSearchRequest request = new WebSearchRequest(query);
// (Optional) Set the maximum number of results to download
//request.setResults(30);
try {
  WebSearchResults results = client.webSearch(request);
  // Print out how many hits were found
  println("Displaying " + results.getTotalResultsReturned() +
    " out of " + results.getTotalResultsAvailable() + " hits.");
  println();
  // Get a list of the search results
  WebSearchResult[] resultList = results.listResults();
  // Loop through the results and print them to the console
  for (int i = 0; i < resultList.length; i++) {
    // Print out the document title and URL.
    println((i + 1) + ".");
    println(resultList[i].getTitle());
    println(resultList[i].getUrl());
    println();
  }
}
// Error handling below, see the documentation of the Yahoo! API for details
catch (IOException e) {
  println("Error calling Yahoo! Search Service: " + e.toString());
  e.printStackTrace();
}
catch (SearchException e) {
  println("Error calling Yahoo! Search Service: " + e.toString());
  e.printStackTrace();
}
Here are the results:
Displaying 10 out of 10500 hits.
1.
Processing 1.0 (BETA)
http://processing.org/
2.
Mobile Processing
http://mobile.processing.org/
3.
Download \ Processing 1.0 (BETA)
http://processing.org/download/index.html
4.
Learning \ Processing 1.0 (BETA)
http://processing.org/learning/index.html
5.
Processing 1.0 (BETA) >> Examples
http://dev.processing.org/
6.
hardware.processing.org
http://hardware.processing.org/
7.
width \ Language (API) \ Processing 1.0 (BETA)
http://processing.org/reference/width.html
8.
Environment (IDE) \ Processing 1.0 (BETA)
http://processing.org/reference/environment/index.html
9.
Video \ Libraries \ Processing 1.0 (BETA) \ Processing 1.0 (BETA)
http://processing.org/reference/libraries/video/index.html
10.
int \ Language (API) \ Processing 1.0 (BETA)
http://processing.org/reference/int.html
This was created using Yahoo's Search API.
The yahoo_search-2.0.1 jar file needs to be dragged into the sketch folder along with the code so that the search classes can be found.
Wednesday, February 13, 2008
FINDCLOUD
Project Context:
Extracting useful data from a visualization, as a means of departing from visualizations of data that are meaningful to other people. Computing methods have evolved to handle large amounts of data, but there is a lack of focus on representing that data in ways that are visually interesting while being useful and inviting interaction.
One purpose of data visualization may be to understand a complex data set.
But if we add a parameter of evolution to a data set, can this information visual be a tool for exploring the original idea or building on it further?
It's interesting that users determine the trend of evolution by modifying the search criteria and the categories: are they looking into blogs or news?
Cloud is an application to look for what's happening in relation to new ideas, or what exists out there when you are thinking of prototyping a new concept.
The keywords used in this search could be completely random sets.
The search results are graphically mapped into clouds which represent the following parameters:
- A sense of the top hits, as if this data were in list form
- Popularity of the cloud
- Relevance to your search
Find Cloud / Mind Cloud is a data visualization and idea exploration tool.
It attempts to visualize activity based on keyword sets that users enter.
Typically a user would search for existing activity on, e.g., Gestures + instruments using a text-based search engine.
The user's order of exploring the data relies on the order in which the search engine's algorithm arranges the links. In some ways search engine results are equivalent to one-page results.
In such cases users modify keywords, and it's hard to save searches and their contexts unless you navigate away from the current page to start a new search.
Find Cloud visualizes successive search results in the form of layers.
The application attempts to aggregate data sets by eliminating the process of:
- hitting a link
- browsing away (tab/another page)
- revisiting the links list
- re-entering search keywords
- examining new links
- re-assessing search criteria based on whether articles meet interests.
While creating different find clouds from user-generated keywords, it pulls random words from the existing links to create word associations that help users think of the searched data in different contexts.
Think Cloud is the cloud that builds as users search using more keywords.
It constructs random word associations with the entered keywords (a, b, c).
The attributes of these random words can be applied to a, b, c to associate new ideas with (abc).
Both versions of the application are collapsible, and users can use only the visualization or both.
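
A minimal sketch of how random words might be pulled from scraped link text for the Think Cloud. The names (`WordAssociations`, `randomWords`) and the filtering rules (skip the user's own keywords, skip words of three letters or fewer) are assumptions made for illustration; keywords are expected in lower case:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Random;
import java.util.Set;

class WordAssociations {
    // Picks up to `count` distinct random words from the scraped text,
    // skipping the entered keywords and very short words.
    static List<String> randomWords(String scrapedText, Set<String> keywords,
                                    int count, Random rng) {
        Set<String> candidates = new LinkedHashSet<>();
        for (String token : scrapedText.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (token.length() > 3 && !keywords.contains(token)) {
                candidates.add(token);
            }
        }
        List<String> pool = new ArrayList<>(candidates);
        Collections.shuffle(pool, rng);
        return new ArrayList<>(pool.subList(0, Math.min(count, pool.size())));
    }
}
```

Passing in the `Random` instance lets a fixed seed reproduce the same associations during testing, while a fresh `new Random()` gives different words on each search.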
Inspiration, Experience and Contextual Research:
I became interested in what visualizations mean to users and how they can be personal and useful as thinking tools. Apart from gleaning useful conclusions, how can they serve to be explorative?
While many visualizations in categories such as Art, Biology, Business Networks (21), Internet, Knowledge Networks, Multi-Domain Representation, and Music (18) were visually pleasing, it felt like there was another learning curve to understand the visual presentation of the information. It also felt like very passive information presentation. What did this information mean to me? Could I control the variables being visualized?
A list form of data searching is what comes up in search engines.
Users move between links in the order they are presented looking for relevant information.
Is there a way a visualization can be an exploration tool to eliminate the layers of searching for contexts that are new and evolving, where information about the level of activity and user-generated content on the subject is valuable?
While most visualizations simply assimilate large, complex data sets, Cloud attempts to add another layer to visualization, one that allows you to explore.
Open Source Programs that work for representing and managing complex computations
http://www.vistrails.org/index.php/Main_Page
Visual depiction techniques/methods to show conceptual uniqueness and originality in the choice of a subject.
http://www.visualcomplexity.com/vc/
Some interesting work in visualizations (Newsmaps, Fidgt, etc.)
http://64.233.183.104/search?q=cache:E5VWG0vJsyYJ:www.smashingmagazine.com/2007/08/02/data-visualization-modern-approaches/+http://www.smashingmagazine.com/2007/08/02/data-visualization-modern-approaches/&hl=en&ct=clnk&cd=1&client=firefox-a
Resource List Wikipedia
http://en.wikipedia.org/wiki/Visualization_(graphic)#Knowledge_visualization
MIT Aesthetics and Computation Group
http://acg.media.mit.edu/
Ben Fry: Genomic Cartography
http://acg.media.mit.edu/people/fry/
Similar Apps: visual exploration of the web
www.walk2web.com
Timeline
Week 4:
Rough Prototype
using 2-3 keywords, scraping top links, presenting pages in elemental form
Sorting out Interaction Issues
Week 5:
Work on Filtering Techniques, Rough UI build for users to enter keywords.
Week 6:
Building Visualization Applet in Processing
Week 7:
Creating a Database of 100 words for which the application works as proof of concept, modifying data filtration for these test cases
Week 8:
Work with visuals and user testing.
Traditional Methods:
What's the nature of search?

Typical Process during an unusual search


Clouds/other graphic entities representing links

Browsing one cloud/link at a time with no departure from the main application page.
Original links are present if a user wishes to visit the actual website.

Random word associations as you browse articles/feeds

Visual feedback for links that have been visited, moving them to a different spatial location on the screen, regathering the stack in the same order when links have been viewed and closed

Methods to convey order in which data was ranked/arranged as in the case of a regular search engine.

Thesis Document Outline
Process documentation
1. Personal Statement
2. Context, Background Research, Inspiration
3. Method:
Prototype Design Treatment
Form, Structure
Content
User Scenarios
User Experience
Mechanics
Design Phase
Implementation of Prototype:
Development Schedule
Resources
Description of Development Process
Design Considerations and Development Issues
4. Research:
Formative Design Research/User Testing
Description of Testing, Goals
Description of Process
Sticky Points
II. Publication
1. Title Page
2. Abstract
3. Introduction
Concept Overview
Concept Sketches
Context
Goals
Audience, Location, Interaction Time
Core Features and Functionality
5. Summary. Conclusions.
6. Bibliography
7. Pointers to web-hosted material (appendices, additional bibliographic references, etc.)