Running simulations and analysing data

My first post in a long time. This is more of a journal entry for me to look back at when I need to.

My PhD project mostly involves running simulations of hundreds of people evacuating from a building and then analysing the results in various ways. While the MASON framework in Java helps a lot with the implementation of the model itself, something just as interesting, and in the end a lot cooler, is running all those simulations, collecting the data and analysing it.

Step 1: Running multiple Simulations

MASON allows you to run simulations in two major ways. The first is the GUI mode, in which you get to see how the simulation is going; this mode is very useful, and essential, when creating and debugging the model. However, when it comes to actually running simulations and gathering data for analysis, it is quite obviously impractical. This is where the console mode comes in handy: it lets you run several replications of the required simulation with the required seed. Initially I used the handy built-in function to do this. I also needed to store the simulation-specific settings somewhere. At first I did this using constants scattered across various classes, then moved all the constants into one class, which was a lot more convenient to change, and finally I resorted to a much more practical XML file, which can easily be read using JAXB in Java (though I might change to an SQL-based implementation soon). Anyway, the point is, I am able to run my simulation using its jar file and an XML file with all the parameters that are used for the simulation.
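My actual setup reads the XML with JAXB in Java, but the idea is easy to sketch in Python with the standard library's ElementTree. The parameter names and structure below are hypothetical, just to show the pattern of one XML file driving a run:

```python
# Sketch: reading simulation parameters from an XML settings file,
# analogous to the JAXB-based approach described above.
# The parameter names (agentCount, timeStep, mapFile) are hypothetical.
import xml.etree.ElementTree as ET

SAMPLE_XML = """
<simulation>
    <agentCount>500</agentCount>
    <timeStep>0.25</timeStep>
    <mapFile>maps/building1.map</mapFile>
</simulation>
"""

def load_parameters(xml_text):
    """Parse the XML and return a dict of simulation settings."""
    root = ET.fromstring(xml_text)
    return {
        "agentCount": int(root.findtext("agentCount")),
        "timeStep": float(root.findtext("timeStep")),
        "mapFile": root.findtext("mapFile"),
    }

params = load_parameters(SAMPLE_XML)
print(params)
```

Keeping all the knobs in one file like this means a batch script only ever has to rewrite the XML, never recompile the model.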

Step 2: Storing data

The next step in this process is collecting data from these simulations. As a way to get started, I stored my initial output as simple text files in CSV format, which I analysed in Excel. Pretty soon this became extremely impractical because of the amount of data I had to store, so I changed to a binary format and created a parser to convert the generated binary files back to text. I could have used one of Java's analysis libraries, like those provided by the Apache framework, but I was quite lazy, and I was working with someone who wanted the text files so that he could analyse them in Matlab, so I settled on binary files plus a parser to convert them to text.
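The binary-file-plus-parser idea can be sketched with Python's struct module. The record layout here (agent id, time step, x, y) is hypothetical; my actual files were written and parsed in Java, but the principle is identical:

```python
# Sketch: a fixed-size binary record format and a parser that converts
# it back to CSV, in the spirit of the binary-plus-parser approach above.
import struct
import io

# "<iidd": little-endian int, int, double, double (agent id, step, x, y)
RECORD = struct.Struct("<iidd")

def write_records(stream, records):
    """Write (agent_id, step, x, y) tuples as packed binary records."""
    for agent_id, step, x, y in records:
        stream.write(RECORD.pack(agent_id, step, x, y))

def binary_to_csv(stream):
    """Read packed records back and emit CSV text."""
    lines = ["agent,step,x,y"]
    while True:
        chunk = stream.read(RECORD.size)
        if not chunk:
            break
        agent_id, step, x, y = RECORD.unpack(chunk)
        lines.append(f"{agent_id},{step},{x:.2f},{y:.2f}")
    return "\n".join(lines)

buf = io.BytesIO()
write_records(buf, [(1, 0, 1.5, 2.0), (2, 0, 3.25, 4.0)])
buf.seek(0)
csv_text = binary_to_csv(buf)
print(csv_text)
```

Fixed-size records keep the files small and make the parser trivial, at the cost of having to version the format whenever a field is added.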

However, despite the organised file hierarchy and naming, this was still very difficult to analyse and keep organised, and it was still very large. There were also a lot of complications when I was writing from multiple runs, experiments, etc. So I switched to what I should have used from the start: a relational database. I set up a MySQL server instance on my lab computer and wrote all the required data to the database at the end of each run of the simulation.
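The per-run write can be sketched as follows. I use the standard library's sqlite3 here purely as a self-contained stand-in for the MySQL server (both expose a similar DB-API cursor interface, though MySQLdb uses %s placeholders instead of ?); the schema and column names are hypothetical:

```python
# Sketch: recording one row of summary results per simulation run.
# sqlite3 stands in for the MySQL server; schema is hypothetical.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE runs (
        run_id INTEGER PRIMARY KEY,
        setting TEXT,
        seed INTEGER,
        evacuation_time REAL
    )
""")

def record_run(conn, setting, seed, evacuation_time):
    """Insert one run's results; called once at the end of each run."""
    conn.execute(
        "INSERT INTO runs (setting, seed, evacuation_time) VALUES (?, ?, ?)",
        (setting, seed, evacuation_time),
    )
    conn.commit()

record_run(conn, "two-exits", 1, 187.5)
record_run(conn, "two-exits", 2, 201.0)
count = conn.execute("SELECT COUNT(*) FROM runs").fetchone()[0]
print(count)
```

One row per run means replications, experiments and settings all live in one queryable place instead of a directory tree of files.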

Step 3: Analysing the data

Excel being boring, I shall not go into the details of how I did it initially. Once I had the data in MySQL, I needed a tool to analyse it. That's when my prof recommended using matplotlib in Python. I had used Python before to write a simple script that cleaned up references in a text file, but hardly for anything else, even though I liked the language a lot. So I decided to give it a try. Interestingly enough, I had a lot of trouble finding a free MySQL library. But once I finally found MySQLdb, querying and analysing the data and getting some neat graphs took hardly a few lines of code. So now, once I had the data, I could simply run the Python script and get all the charts I needed.
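The query-and-plot step really is only a few lines. Again I use sqlite3 as a stand-in so the sketch is runnable anywhere; with MySQLdb the cursor calls look much the same. Table and column names are hypothetical:

```python
# Sketch: pull an aggregate out of the run database, ready for plotting.
# sqlite3 stands in for a MySQLdb connection; schema is hypothetical.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE runs (setting TEXT, evacuation_time REAL)")
conn.executemany(
    "INSERT INTO runs VALUES (?, ?)",
    [("one-exit", 240.0), ("one-exit", 260.0),
     ("two-exits", 180.0), ("two-exits", 200.0)],
)

# Mean evacuation time per setting: the kind of aggregate that ends up
# on a bar chart.
rows = conn.execute(
    "SELECT setting, AVG(evacuation_time) FROM runs "
    "GROUP BY setting ORDER BY setting"
).fetchall()
settings = [r[0] for r in rows]
averages = [r[1] for r in rows]
print(settings, averages)

# With matplotlib installed, the chart itself is then only:
# import matplotlib.pyplot as plt
# plt.bar(settings, averages)
# plt.ylabel("mean evacuation time")
# plt.show()
```

Letting SQL do the aggregation keeps the Python side down to fetching rows and handing them to matplotlib.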

Step 4: The power of the cloud

A single run of my simulation can take up to 5 minutes. For 100 replications under each of 6 different settings (which is what I needed for that particular run), this works out to about 3000 minutes, or 50 hours, or just over 2 days. While not terrible, I needed my computer, and I worked at the parallel and distributed computing centre, so it would have been a waste not to use all that computing power at our disposal. So I got myself an account on the cluster and created a simple shell script that would run the simulation with fixed settings. Eventually I extended this so that it would read parameters from a separate text file, modify the XML file appropriately, run the simulation the required number of times and, finally, send me an email at the end of the run. Here is that first script:

 
#!/bin/bash
# runSimulations: runs simulations using the inputs in the settings file

opath=$PATH
PATH=/bin:/usr/bin

case $# in
  0|1) echo 'Usage runSimulations settingsFile xmlFile' 1>&2; exit 1
esac

awk -v xmlFile="$2" '
BEGIN {totalCount=1
startingPoint[1]=1}
{
  model[NR] = $1
  startingPoint[NR+1] = startingPoint[NR]+NF-1
  for(i=2;i<=NF;i++){
    completeValuesList[totalCount] = $i
    totalCount++
  }
}
END {
  startingPoint[NR+1] = totalCount-1
  for (j=0; j<=NR; j++){
    indices[j] = 0
  }

  while(indices[0]!=1){
    timeNeeded=0

    for(j=1;j<=NR;j++){
      value[j] = completeValuesList[startingPoint[j]+indices[j]]
      command = "overwrite " xmlFile " xmlParser " model[j] " " value[j] " " xmlFile
      # print command
      system(command)
    }
    testCommand = "grep FilePath " xmlFile;
    testCommand |getline filePathLine
    close(testCommand)
    seed = 1
    javaCommand = "java -cp dist/CrowdSimulation.jar app.RVOModel -repeat 100 -time 100 -seed " seed
    # print javaCommand
    system(javaCommand)
    for(j=NR;j>=1;j--){
       if(startingPoint[j]+indices[j]==startingPoint[j+1]){
          indices[j]=0
          indices[j-1]++
       }else {
          if(j==NR){
             indices[j]++
          }
       }
    }
 }
}' "$1"
echo $1 $2 "run complete"|mail -s "Run Complete" vaisaghvt@gmail.com

For anyone with a little experience in shell scripting this might look like crap, so if you are bored enough to go through it and you know some shell scripting, please do send me any suggestions you have. That was the code for my first project. In my second project I've changed my approach to having a separate class for each experiment. Initially I also did the work of connecting to each cluster and initialising the job manually; now I've automated this too. I specify the experiment and settings to be run, the script dispatches the jobs to the specified set of clusters and, as above, I get emailed at the end when the data is available.
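Incidentally, the nested-index bookkeeping in the awk script above is essentially a Cartesian product over each parameter's list of values. A sketch of the same sweep in Python, with hypothetical parameter names and values:

```python
# Sketch: the parameter sweep from the awk script as a Cartesian product.
# Each combination would be written into the XML settings file and then
# one batch of simulation runs launched with it.
import itertools

settings = {
    "agentCount": [100, 200, 400],
    "exitWidth": [1.0, 2.0],
}

combinations = list(itertools.product(*settings.values()))
for combo in combinations:
    params = dict(zip(settings.keys(), combo))
    # Here one would rewrite the XML file using `params`, then launch:
    #   java -cp dist/CrowdSimulation.jar app.RVOModel -repeat 100 ...
    print(params)

print(len(combinations))
```

itertools.product replaces the whole hand-rolled odometer loop, which is the part of the awk version that is hardest to get right.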

#!/bin/bash
opath=$PATH
PATH=/bin:/usr/bin

case $# in
  0) echo 'Usage runExperiment classToBeRun' 1>&2; exit 1
esac
program=$1

for cluster in "c0-0 0" "c0-1 20" "c0-2 40" "c0-3 60" "c0-4 80" "c0-5 100"
   do
      set -- $cluster
      ssh $1 "nohup ./runCommunication.sh $program $2 2> 2_$2.log 1> 2_$2_1.log < /dev/null &"
      echo "assigned to $1"
   done

SSHing to a remote client and running the command with nohup were the two most difficult parts of this. Nohup lets the process keep running even after you disconnect from the machine. The & at the end makes the process run in the background, so that you can disconnect and connect to the next machine or do other things. The output is redirected to log files so that I can keep track of what is happening. Finally, something that took me a long time to figure out: you should redirect input from /dev/null, otherwise you will not be able to disconnect from that particular remote machine.

#!/bin/bash
opath=$PATH
PATH=/bin:/usr/bin

case $# in
  0|1) echo 'Usage runSimulation classFile parameter' 1>&2; exit 1
esac
java -cp IBEVAC.jar $1 $2

echo "$1 $2 run complete"|mail -s "Run Complete" vaisaghvt@gmail.com

There's still a lot more automation I can and plan to do. But as of now I'm in a state where I can run simulations quite easily, and I won't be changing things much for some time. Next stop: getting a proper git flow happening with NetBeans or Eclipse.

Sublime Text Latex Cheat Sheet

(Scroll to bottom for the pdf cheat sheets) 

The quest for efficiency

I've been looking around for a good editor ever since I started using LaTeX.

When I first ran into this problem, during my FYP, I settled for WinEdt, which seemed to be the most popular editor out there. Since I had to finish my FYP report quickly (I had about a week to set up and learn LaTeX and write my report in it), I didn't bother experimenting with different editors. The problem for me with LaTeX on Windows was that it took a really long time to set everything up. And for some reason, I couldn't typeset my tex files into PDF from within WinEdt. I've heard this is possible now; I'm not sure if I just never figured it out or the feature wasn't available in earlier versions.

In any case, once I started my PhD, and more specifically when I was about to start my confirmation report, there were whole new complications. I had a Mac at home and a PC in the lab, and I would need to work on my report in both places. I could use WinEdt or some such on Windows and TexShop or TeXworks on OS X, but I really didn't like either of those apps, and I wanted to learn to properly use one powerful text editor so that I could work better. So I did a check online, and the most fanatical text-editor people were the ones using Vim or Emacs. I started using Emacs; not entirely sure why, but I guess it's because I found a Windows version of Emacs and because I had heard of Aquamacs for the Mac. Within a day or two, I realised that this would be pointless for me. For one, I like using my trackpad and mouse, and I like having a proper GUI, regardless of how powerful the app can be without it. Moving the cursor in the app was itself a pain, so I didn't really want to do much more. I was sure there must be some modern text editor that could do all the stuff I needed without me having to waste so much time learning Emacs. I mean, there had to be something better than Notepad++ and TextWrangler that was also free.

Sublime Text 2

That's about when I came across Sublime Text (http://www.sublimetext.com/). It's powerful, beautiful, has a lot of customisable key bindings and plugin development opportunities (using Python), and, most usefully at that point in time, it was cross-platform. And yep, it was completely free. You could choose to pay the developer (Jon Skinner) for a licensed version; however, the unlicensed version did all that the licensed version did, just with an annoying pop-up at every 10th save or so.

I use it generally for C/C++, Python, XML, HTML and LaTeX, but you can use it for all sorts of things. The app also has a console which can be used for doing all sorts of things; for one, when working with Python you can actually compile your files and run them from Sublime Text itself. Please take a look at the website for the features. There seems to have been a lot of activity on the plugin and package development front over the last few months.

To start installing and discovering packages, first install Sublime Text 2 (http://www.sublimetext.com/2) and then install Package Control (http://wbond.net/sublime_packages/package_control). And voila! Installing and discovering packages becomes as easy as taking candy from a baby.

Now, to start using LaTeX in Sublime, you'll want to install the LaTeXTools package (http://tekonomist.wordpress.com/about/). It provides all sorts of features, including forward and inverse search, wrapping text in environments and commands, auto-completion of references, etc. Combined with the shortcut keys provided by Sublime Text and Papers2's Manuscripts, I now have an amazing setup for writing my papers. Things have never been easier.

But to actually make full use of these, a cheat sheet is absolutely essential. So I've created a PDF of the key bindings in Sublime and LaTeXTools that will be most useful for anyone using Sublime for LaTeX.

Cheat sheet:

Windows:  sublimelatexsheet – Windows (p.s. any and all suggestions are welcome)

Mac : sublimelatexsheet -Mac

Tex files for cheat sheet: Tex source files for cheat sheet

*Edit: added to the cheat sheet some regular expressions that I use for checking.

*Edit: Changed for new plugin settings
Update: Installing and updating LaTeXTools is a lot easier now, and the plugin is a lot more awesome. I don't even need Papers' Manuscripts (which is now available on Windows too). Most of this article is old, but the LaTeXTools plugin is still updated as of Oct. 24, 2012.

*Edit: For updated regexes and a plugin to use these in sublime text please refer to my newer post : https://vaisaghvt.wordpress.com/2013/04/17/a-sublime-text-plugin-for-the-careless/

Why you might not want to use Word in the first place…

In my last post I talked about how to use MS Word properly for reports. However, for any report more than a few pages long, with a few references, equations and pictures, MS Word starts getting cumbersome. Placing pictures in a huge report is generally very irritating: a picture never stays where you want it, and sometimes it just disappears from the page altogether. In fact, the difficulty of placing figures is the single best reason not to use Word. Referring to specific chapters, sections or figures can be difficult. Restructuring the report messes everything up, and in a report of reasonable size this is bound to happen.

Much more importantly, when you write a report in Word you're constantly thinking of how things should look: where they should go, the fonts to use and the styles to use. If you don't, you're most likely going to have to restructure a lot after you've written everything, which leads to the same problems described earlier. Also, while most people do have Office, or at least some software like Pages or Writer that converts docx files to a "usable" format, a "usable" format almost never looks like it does in Word, and there are (increasingly) a lot of people who don't use Word at all.

LaTeX is the alternative I'd suggest. Admittedly, LaTeX isn't for everyone: it's plain irritating to use if you're new to it and just writing something small. But for anything like a thesis, your final year report or your dissertation, LaTeX is the way to go.

In LaTeX, you simply write the text in a generic text file called a .tex file and typeset it using LaTeX to get a PDF file. The tex file can be opened on any system, since it's simply a text file. You don't worry about layout and placement; you simply write your content and put in references to images. LaTeX takes care of placing images properly and getting your references to figures right. Writing equations is easy and the equations look beautiful; in fact your whole report looks beautiful and very professional. Page numbers, tables of contents, bibliographies, etc. can be generated with a single command, and their formats and styles can be changed easily. Most conferences specify their preferred style, and changing styles is as simple as writing a single command or copying a few lines from the specified format.
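A minimal sketch of what such a .tex file looks like (the image filename and labels here are only illustrative):

```latex
\documentclass{article}
\usepackage{graphicx}
\begin{document}

\section{Results}\label{sec:results}
As Figure~\ref{fig:layout} shows, \dots

\begin{figure}[tb]
  \centering
  % "building-layout" is a placeholder image filename
  \includegraphics[width=0.8\textwidth]{building-layout}
  \caption{Layout of the simulated building.}
  \label{fig:layout}
\end{figure}

Section~\ref{sec:results} and Figure~\ref{fig:layout} renumber
themselves automatically if the report is restructured.

\end{document}
```

The point is that figure placement and all the cross-reference numbers are LaTeX's problem, not yours.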

I won't go into details here, since there are a lot of resources online. But I would just like to ask everyone with at least some time to spare to try it. Most resources on setting up LaTeX are just a Google or Bing search away. Do contact me if you want any help or advice on getting started with LaTeX, or if you want a few templates (I'll upload them anyway when I get a little time). A few resources to get you started:

http://www.osnews.com/story/10766 : An excellent, slightly long article on why you might want to use LaTeX; it also provides a hello world and other basic stuff.

http://www.stdout.org/~winston/latex/latexsheet.pdf : a cheat sheet for latex necessary for anyone using latex regularly.

http://en.wikibooks.org/wiki/LaTeX : A comprehensive and detailed wiki about everything you might want to know in latex.

http://ftp.itb.ac.id/pub/CTAN/info/lshort/english/lshort.pdf : A not so short introduction to latex.

http://www.ats.ucla.edu/stat/latex/ : A bunch of links for latex.

Writing reports in Microsoft Word

Almost all students use MS Word. At NTU it is almost the only word-processing software students use for writing reports, even PhD students. I would recommend that any student with at least a few hours to spare give LaTeX a try; it's totally and completely worth it. However, I'll talk about LaTeX another time. For now, I would like to talk about ensuring a certain minimum standard for reports made in MS Word. Most people have no clue of the capabilities of Word and don't make use of some of the brilliant built-in tools that could make things both more convenient and much neater.

Even though I've always been a big fan of the MS Office suite, I hardly use MS Word these days; I actually stopped using Word just over a year and a half ago, after shifting to LaTeX. I've hardly tried Pages or Writer, so I can't comment on how good they are; the last time I tried Writer was too long ago for me to say anything about its current state without making a fool of myself. So most of these guidelines are specifically for Word users.

  1. Use justified alignment, 1.5 to 2 line spacing and a font size of at least 9 or 10, depending on the font chosen. Bigger and smaller sizes make documents either childish-looking or too hard to read. Make sure you use the same font and styling throughout the document, and use a simple font, not something fancy or weird; especially not Comic Sans.
  2. Make use of the built-in formatting tools. For your chapter headings choose "Heading 1", for section headings use "Heading 2" and for subsections use "Heading 3". This ensures uniformity, lets you navigate easily through the sidebar in Office 2011 and also helps auto-generate the table of contents. Apply the styles whenever possible, even to normal text. This way, to change the style of the document you can just choose one of the styles provided by Word, or create your own custom styles.
  3. Use footers to generate page numbers.
  4. Always do a spell check, but don't always trust the spell checker. Try switching off autocorrect while you work, because quite often it makes ridiculous corrections.
  5. For equations and symbols use the built-in equation and symbol editor. It is irritating and slow to use, and the quality of rendering is nowhere near as good as in LaTeX, but it is very much better than typing out equations, which should never be done unless there is no other choice.
  6. Insert captions for figures and tables. They should be updated automatically; in case this doesn't happen, I think it can be done by selecting the whole text and pressing F9.
  7. Use the built-in reference management tool provided in Word. If not, at least use the EndNote or Mendeley plugins. I haven't tried these, but I'm sure they'll be better than manually entering references. Generally, most references require you to keep track of the author, the year of publication and the journal, website or whatever the source of the article is; page numbers should preferably be included too. Word provides a relatively easy method of entering this data.
  8. Generate the table of contents and bibliography automatically. This ensures everything stays updated and more often than not gives a relatively neat output.
  9. Keep saving your work constantly. Preferably keep the report you're working on in a Dropbox folder: besides making the document available everywhere and acting as a reliable backup, Dropbox's built-in versioning system makes it easy to go back to earlier versions of your document.
Some general report guidelines:
  1. Write a structure first before starting on the text. First write the chapter-wise organisation, then the section organisation, then what you will talk about in each section. Each paragraph should make sense, each section should be summarised at the end, and there should be a smooth logical flow to the next section.
  2. Always adhere to the word limit. If you can convey all your ideas in fewer words and pages, you're more likely to impress your examiner. Using more words generally shows a lack of clarity about the concepts.
  3. Keep figures either at the top or the bottom of a page and always refer to figures in the text.
  4. Have page numbers at the bottom or top of page.
  5. Don't change the font size or spacing from the specified values. If you really need more space on the page, go to page setup and change the margin sizes.
  6. Always reread your report. If possible, get someone else to read it for you. Taking a printout and reading it is a very effective way of finding deficiencies in the document.
  7. Unless you are specifically asked to submit a word document, convert the file to a pdf file. This makes it neater and much more accessible to the professor.
  8. Wherever possible use diagrams, charts, etc. to explain your ideas.

These are just some of the points that come to mind. The two documents below give much more detailed guidelines for writing reports in Word. I would recommend that anyone writing a report in Word read at least one of them.

1. www.jasonpang.net/reference/word_report.pdf : Fairly short, well written and extremely useful. Meant for Engineering students at the University of Waterloo.

2. http://www.brad.ac.uk/lss/documentation/word2007-long-document/word2007-long-document.pdf : Similar to above, but slightly more things are covered. I prefer the first one just because it’s much shorter.

Any other guidelines you can think of, please do comment…

Paper Management Tools: What are they? Why use them? And which one to use?

If you've been doing any research, be it for an assignment, a report or your dissertation, you will have read at least a few papers, articles or books. While researching for reports during undergrad, I frequently ran into a lot of problems:

1. I lost papers or articles that I had read

2. Even in cases where I actually had a paper, I more often than not forgot what I had read in it and why that particular paper was important.

3. Even if I remembered a paper and what was in it, it was impossible to find it.

4. For citing the papers in my report, I had to manually enter all the metadata (author name, article, journal, publication date, etc.), which I could often not find on my printed copy of the paper.

5. I’ve seen huge piles of papers spread across the tables of PhD students. In fact, when I came to my lab for the first time as a PhD student, the size of the stack of papers next to the student indicated how long they had been a research student there.

6. I couldn’t carry all my papers with me.

7. There was also the standard problem of trying to highlight things on the paper. Beyond a point, my whole paper would end up highlighted or underlined to show the interesting parts.

All these problems can be easily overcome by using online personal digital libraries. The simpler, older ones are plain reference managers: EndNote, BibDesk, JabRef, etc. The newer ones, like Mendeley, Wizfolio and Papers, are not only reference managers but much more capable PDF organisers as well, and for all practical purposes they can be thought of as your own personal library. I've tried all of these to different extents over the last two years, so I'll first try to convince you why it's really important to use software like this, and then cover some of the differences I've found between them and why I would recommend some and not others.

Firstly, let's get the simple reference managers out of the way. They are without a doubt extremely nifty tools for paper writing, and until recently it was impossible to work without them. They keep track of all the papers you've read and their metadata, and through plugins they allow you to easily cite these works in your document. I've not really used EndNote much, even though it's the one I've seen most advertised in my university. My supervisor and others who had used it started cursing the moment I mentioned it, so truthfully, I never gave it a try; but a common complaint I've heard is that it's not particularly intuitive. JabRef is a Java-based (hence cross-platform) client for handling your document references, provided you use LaTeX. BibDesk is much more powerful but, as far as I know, exists for the Mac only. BibDesk is a seriously good reference manager: it allows you to write notes about a paper, stores the actual files and other related files, and is very stable. I would recommend it to anyone using a Mac if it weren't for software like Papers and Mendeley.

Zotero is a similar tool, available as a plugin for Firefox. It can store all sorts of documents with their metadata. The best things about Zotero are that it can detect some metadata automatically, that you can organise your papers into categories and give them tags, and most importantly, that you can download papers directly into your database from the internet. Sadly, even though it's a Firefox plugin, the data is stored on your computer itself, so unless you find some innovative way to use Dropbox there isn't really any way to get your library across computers. And it's ugly: it takes up a lot of space on your Firefox screen, generally doesn't look good and isn't pleasant to use. I gave up on it within a week.

Papers, Mendeley and Wizfolio take this to a whole different level. These applications are in a way very similar to BibDesk and Zotero: they store documents, metadata and notes about each file, and like Zotero they have plugins (very buggy ones) that help you import files from websites directly into your database. However, the way the information is presented and the way you interact with the system changes the entire equation. They have much cleaner, better, more intuitive interfaces; they can usually guess the metadata from the file; and they let you read within the app itself, highlighting and taking notes right there. You generally have three columns: one with the list of your papers, organised the way you want; another showing the paper you've currently selected or the results of your current search; and the last generally holding the notes you've taken about the particular paper. Besides organising papers in hierarchies, you can generally add your own tags. They also have built-in search that looks through each document's text, metadata and notes for your keyword. So it becomes a lot easier to find that paper you read three years back, which you remember was written by a chap from Cornell and mentioned a crystal ball on the second page…

Wizfolio is the youngest of the lot and the one I've used the least. It's very intuitive. It is supported by a lot of online repositories, which makes it good at getting metadata. It allows you to group papers into categories and to store all kinds of documents, ranging from videos and pictures to websites and books; in fact it even provides a built-in YouTube search engine. For existing Mendeley users it also provides an import-from-Mendeley option that imports all the papers and their metadata without any problems. I've also heard they have a very awesome iPad app. The only reason I don't use it is that it exists only in the cloud, which is mostly a pain because I don't always have an internet connection, and quite frankly I'd prefer to switch off the distraction of the internet when working towards my final deadlines.

Mendeley is perhaps the most popular of the lot. Like Wizfolio, your documents are stored in the cloud. Mendeley also tries to advertise itself as the Facebook for academics through their social networking site; in fact, that is the main feature they advertised initially: the ability to collaborate with others. I've not really seen much activity on Mendeley in my field, so I've found this aspect quite pointless. The best thing about Mendeley is that they have desktop clients for Mac and Windows (I think Linux too, though I'm not entirely sure), and it is free as long as you have less than 500 GB of papers and documents; beyond this, you start having to pay a monthly fee. The data being stored in the cloud allows the desktop clients to be synchronised across different computers, including all the metadata. I've sometimes found problems with this syncing, though most of these issues are with notes and similar metadata, and Mendeley gives a warning when there are conflicts. It is extremely good at extracting metadata automatically. It allows you to highlight text and add notes and annotations to the papers, besides notes about the paper itself. Another useful feature is the ability to create groups in which all the users see papers together; this is an invaluable tool for collaboration, or for discussing things with your PhD supervisor and other colleagues. The interface is neat and keeps improving, though it doesn't look half as good as Papers. Mendeley also provides plugins to import directly from the web, but these only download the metadata (and not the PDF), and that too to your cloud database, so I found this feature useless. They also provide plugins for Word and OpenOffice to cite directly; since I stopped using Word a while ago, I'm not entirely sure how good these plugins are. It also has mobile apps for iPhone, iPad and Android. It's been at least a year since I tried the app out; I initially found it pointless because, by default, it would only store references on the mobile. Overall, Mendeley is amazing, and if you use multiple platforms, want something that's free (up to 500 GB) and are one of the lucky people with an actual Mendeley community in your area of research, then go ahead and use it.

Finally, let's get to Papers, which is my favourite. Unlike the previous two, it stores all the data on your hard disk, but by keeping the library in your Dropbox folder this problem can be easily overcome. As with Mendeley, there is an iPhone version (no Android version) and a really, really good iPad app, which is beautiful and lets you make notes and annotations on the paper. Papers for the Mac does not allow you to make annotations directly, though you can open PDFs in third-party PDF readers like Skim, which does. One of the key advantages of Papers over Mendeley is its beautiful interface; when reading a bunch of dreary papers, this actually matters quite a bit to me. I love good design. Papers2 allows you to specify as much, if not more, metadata than Mendeley. It has also become slightly better at grabbing metadata from files (though I think Mendeley's is still better). It has a built-in search engine covering multiple databases like Google Scholar, Scopus, Web of Science, etc. You can also put in your library's proxy so that you can access papers and download them directly. They've recently added a social element called "live" because of the competition, but I haven't tried it much and I'm not sure how well it will work, considering they don't have a website. Papers only offers a free trial for 30 days; beyond this it is paid software, and expensive enough to discourage most students. However, there is one killer feature for which I willingly paid the registration fee: Papers Manuscripts. Generally, when citing papers in LaTeX, we export a .bib file containing all the metadata and use generated or pre-specified 'cite keys' to refer to each document. This can invariably be a pain, having to look up the cite key for each paper. Similarly, for Word there are two ways to specify references: either use the built-in reference manager or use the plugins provided by EndNote or Mendeley.
If you had to refer to some paper or citation somewhere totally different, like a PowerPoint presentation or an email you're sending someone, you were on your own. Papers2 changes all that. Wherever you are, you press the ctrl key twice and a small search box opens up; you type a word or phrase you remember from that paper (author, title, journal, year, your notes, affiliated university, content or anything else) and the paper is shown to you; you can then insert it as a cite command or as a formatted reference, as per your need. So you can literally have all your papers at your fingertips! (I think for citations in Word you might have to export to EndNote and then use their plugin.)

Awesome, eh?

(Screenshots: a formatted reference that has been inserted, and the pop-up box that appears on pressing ctrl twice. Typing 'RV' returns all the papers that use the RVO model, and you can choose one to insert as a citation or a formatted reference.)

Oh, and one last thing: all of these applications are new and constantly updated, with features arriving at a very frequent rate, so I'm sure this post will be outdated pretty soon. As of now, though, I see Mendeley updating a lot more frequently than Papers. Papers has recently started announcing on its forums what they are working on, which I believe is an extremely good idea.

Addressing some general issues:

The first and most common complaint I've heard is that reading on a screen is a pain. This shouldn't discourage you from using one of these tools: nothing prevents you from printing the paper out, reading it on paper, and just using the software to note down the important points. In fact, Papers even keeps track of which papers you have already printed. Besides, I think this problem will be solved over the next few years by a better Kindle or more widespread use of e-ink technology.

Entering metadata is a pain, but a lot of these apps fetch it automatically. They're not always good at it, though, and more often than not you'll have to enter some of the metadata yourself. But this is necessary anyway, since you'll need it for your references.

Most of them are free and very intuitive to use, so there is practically no learning curve. In the unlikely case that you do need help, they have good forums.

What if you already have a lot of papers? If you have soft copies, you can just drag and drop them in. If you have hard copies, Papers lets you scan them in directly, and the others can store images in their libraries, so you could scan and import them. It's definitely worth the hassle if you're likely to read many more papers over the rest of your career, or even just the next few years.

I can't think of any other issues, but if you do have any, please don't hesitate to voice them in the comments.

Concluding remarks:

I don't think a research student should be without a paper-management tool like Papers or Mendeley. It keeps you organized, saves all those trees, prevents global warming, and helps make the world a better place for you and for me and the entire human race (http://www.youtube.com/watch?v=BWf-eARnf6U&ob=av3e).

Some other (slightly dated) links:

http://www.library.ucsf.edu/help/citemgmt/more : a tabular comparison of Zotero, Mendeley and Papers. Most of the 'no's for Papers are 'yes'es now. I wanted to make something like this, but it feels like too much effort for now; I'll make one later and edit this post.

http://astuscience.wordpress.com/2009/05/05/papers-or-mendeley/ : Old but interesting nonetheless. Most of the disadvantages mentioned here have since been overcome.

http://forum.mekentosj.com/viewtopic.php?id=9 : The creator of Papers defending why Papers and BibDesk serve different clienteles. Again, this is old; with Papers2 they've just about taken over BibDesk's clientele too…

If you know of any more good reviews or comparisons of these tools, please do post them below and I shall add them here for anyone's reference.

Why you should listen to Rahman songs multiple times to actually like them…

I've always heard, and always told others, that you should listen to A. R. Rahman's albums multiple times to really enjoy them. Many people, especially older people who enjoy lamenting the sad state of Indian music these days, say this makes no sense, and that the mere fact that we need to listen multiple times shows the drop in quality. Recently, after watching a documentary about how the human brain works and thinking about my thesis, it suddenly struck me why this is the case, and, as I now like to say, how my thesis is connected to Rahman's music!

To understand what I mean, let me first explain the basic idea of the documentary and how the human brain works. While our five senses are constantly feeding us information about the world around us, our brain does not process everything we see or hear. This manifests itself in various ways: when you're immersed in reading a book or looking at something, you might not hear someone calling you. It is also the basic way a magician pulls off his illusions, by misdirecting our attention. The documentary had an interesting video of a man performing a simple magic trick that viewers were asked to watch carefully. While watching him do his trick, neither my friend nor I saw a six-foot-tall bear, a rabbit and a gorilla walk behind the magician. Apparently, even though our brain doesn't process all the information the senses receive, it still creates a complete and comprehensible picture of the world for us; it is not as if we saw a hole in the world where the animals had been.

So how does the brain determine which information to process? While the exact mechanism is not known, there are a few theories that explain how we process information. First, there is the concept of "chunking": the process by which we group similar pieces of information into "chunks". Chunking can be thought of as the brain compressing more information into less space. Miller's original, revolutionary paper proposing this idea suggested that humans can process 7 ± 2 such chunks of information; more recent studies suggest we can cognitively process only about 4 ± 2 chunks at any given time. The other important idea is that the brain processes the information it believes most relevant or important to the task currently being undertaken. This is why, while reading, we see the words clearly but don't hear someone calling, and why we are so easily fooled by a good magician.

Bear with me for one last interesting and relevant idea, which is easier to explain with an example. Have you ever noticed that as soon as you learn a new word and its meaning, you start noticing it everywhere? It's only recently that I learned the Singlish word 'ang mo'. Since then, I've been hearing all the local Singaporean taxi drivers and hawker-stall uncles use the term. Obviously, I had been hearing the word all along without actually listening to it. Though I don't have references for this, I think it can be easily explained: when we learn something new, the brain learns to associate that "something" with other things, so it can more effectively fold it into pre-existing "chunks" and encode the information for storage. Basically, the brain finds efficient ways to encode information so that it can work around its limited capacity.

So how does all this relate to a Rahman song? Well, I'll need one more example for this. When trying to sing a song, a trained singer will grasp the notes and the tune more easily because he can understand and identify the notes better. An untrained singer might sing just as well, but will most probably have to listen a few more times to pick out the notes and the subtleties and intricacies in the music. Finally, what's so special about Rahman's music? More often than not, his songs are very complex, with multiple layers to the music. This means that when we hear a song for the first time, our brain processes only some of these "layers". On a second listen, the layers we have heard before are already familiar, so our brain encodes them efficiently, which frees up capacity to take in more layers. Don't think it makes sense? To take a recent example, consider the song "Hawa Hawa" from the movie Rockstar. It is unlikely that a person hearing it for the first time will notice the intricate line being played on the violin (or is it some other string instrument?) in the background. Only after listening to the song a few times are we able to process all the different instruments and the subtleties in the music, and finally hear it in all its glory…

Another interesting source on crowd dynamics

I'm finally realising the usefulness of Twitter, and in the most unlikely of ways. I'd been following cracked.com mostly for the jokes, and then they tweeted this earlier today:

http://www.cracked.com/article_19004_6-things-that-annoy-you-every-day-explained-by-science.html

And that had some references to annoying things in crowds, along with some actual references to work being done in crowd dynamics — exactly the kind of work I was looking for. I've stumbled on a few new papers and research groups and shall post updates soon. But here is the first interesting page I found: http://mehdimoussaid.com/project4.html . Just posting so that I don't forget.