Monthly Archives: September 2009

Data Mining and Statistics

When I first heard of Data Mining, I wondered how it differed from simple indexing on the one hand and statistics on the other. Was this just another buzzword?

Since then, I’ve been asked the same question by others. This week, I came across an interesting article on the Data Miners Blog that did a good job of outlining some of the similarities and differences between Data Mining and Statistics.

In a nutshell, Data Mining is concerned with the process of obtaining meaningful information from raw data, some of which is not necessarily apparent to human observers. Statistical methods are some of the tools that can be used in this process, but there are also many other pieces of the puzzle.

Basic Install and Setup of Subversion (SVN) on Windows

I am going through the process of setting up SVN on Windows and later will be adding authentication via Active Directory. I plan to post about each step along the way, as a resource to myself for future work, and maybe it will help someone else as well.

Basic Installation and Setup Instructions

Environment:

  • Subversion
  • Apache
  • Windows Server 2003
  • No Authentication

1. Get the Windows binaries
http://subversion.tigris.org/getting.html#windows

2. Run the installer and make sure to check the box to include the Apache modules.

Note:

  • This install also brings a copy of Apache with it, so you don’t have to download and install it separately.
  • The Apache service in the Windows services list is: “CollabNet Subversion Apache”
  • The install directory for Apache is (by default): C:\Program Files\CollabNet\Subversion Server\httpd

3. Open port 80 (or whatever port you chose) in Windows Firewall

4. Create the SVN repository

  • The install will create a directory “C:\svn_repository” but it will not set it up as a repository
  • From a command prompt run: “svnadmin create c:\svn_repository”
  • This will add a handful of subdirectories and files
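
As a quick sanity check on the result of “svnadmin create”, here is a minimal Python sketch that reports which of the usual top-level entries are missing from a directory. The function name `missing_repo_entries` is hypothetical, and the expected entries (conf, db, hooks, locks, and a format file) are an assumption based on what a default “svnadmin create” typically produces; other versions or layouts may differ.

```python
from pathlib import Path

# Top-level entries a freshly created repository is assumed to contain.
EXPECTED_DIRS = ("conf", "db", "hooks", "locks")
EXPECTED_FILES = ("format",)


def missing_repo_entries(repo_path):
    """Return the expected repository entries that are absent from repo_path."""
    root = Path(repo_path)
    missing = [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (root / f).is_file()]
    return missing
```

Calling `missing_repo_entries(r"C:\svn_repository")` before running “svnadmin create” should list every entry, and afterwards it should return an empty list if the repository was set up as assumed here.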

5. Ensure the needed modules are present and loaded in Apache.

  • The installer set this up automatically for me, but if you’re using a separate Apache install you’ll need to add these yourself
  • In the httpd\modules directory:
    • mod_dav.so
    • mod_dav_svn.so
  • Corresponding “LoadModule” directives should exist in httpd/conf/httpd.conf
    • LoadModule dav_module modules/mod_dav.so
    • LoadModule dav_svn_module modules/mod_dav_svn.so
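
If you would rather not eyeball httpd.conf, the check above can be sketched in a few lines of Python. This is only an illustration: `loaded_modules` and `missing_modules` are hypothetical helper names, and the parsing is deliberately simple (it skips commented-out lines and does not follow Include directives).

```python
import re

# Module identifiers named in the LoadModule lines above.
REQUIRED_MODULES = ("dav_module", "dav_svn_module")


def loaded_modules(conf_text):
    """Collect module identifiers from active (uncommented) LoadModule lines."""
    loaded = set()
    for line in conf_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # commented-out directives do not count
        match = re.match(r"LoadModule\s+(\S+)\s+\S+", stripped, re.IGNORECASE)
        if match:
            loaded.add(match.group(1))
    return loaded


def missing_modules(conf_text):
    """Return the required modules that httpd.conf does not load."""
    loaded = loaded_modules(conf_text)
    return [name for name in REQUIRED_MODULES if name not in loaded]
```

To use it, read the contents of httpd/conf/httpd.conf into a string and pass it to `missing_modules`; an empty list means both directives were found.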

6. Add SVN “Location” information to the bottom of the httpd/conf/httpd.conf file.

<Location /repos>
DAV svn

SVNPath C:\svn_repository
</Location>

  • Note that you want “SVNPath”, not “SVNParentPath”, unless you are trying to have a parent folder with multiple repositories under it.
  • SVNPath should point to the directory where you created the repository earlier
  • /repos is the path that will appear in the URL used to request content, i.e. http://server/repos
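
To make the mapping from the Location path to request URLs concrete, here is a small Python sketch. The helper `repo_url` is hypothetical, and it assumes plain HTTP with no authentication, as in this setup; the port only appears in the URL when it is not 80.

```python
def repo_url(host, location="/repos", port=80, path=""):
    """Build the URL for a path inside the repository served at `location`."""
    # Port 80 is the HTTP default, so it is left out of the URL.
    netloc = host if port == 80 else f"{host}:{port}"
    parts = [location.strip("/")] + [p for p in path.split("/") if p]
    return f"http://{netloc}/" + "/".join(parts)
```

For example, `repo_url("server")` gives http://server/repos, and `repo_url("server", "/repos", 8080, "test/readme.txt")` gives http://server:8080/repos/test/readme.txt.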

Testing Your Setup

To test Apache, go to the default page in a browser:

http://server/

This should display a page that says “It works!”

Once you know Apache is up, you can try to browse your repository with the TortoiseSVN repo browser.
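
The same check can be scripted. Below is a minimal Python sketch using only the standard library; `http_status` is a hypothetical helper name, and the choice to bypass any configured proxy is an assumption that the server sits on your local network.

```python
from urllib.error import HTTPError, URLError
from urllib.request import ProxyHandler, build_opener


def http_status(url, timeout=5.0):
    """Return the HTTP status code for url, or None if no server answered."""
    # An empty ProxyHandler disables proxies (assumes a local-network server).
    opener = build_opener(ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        return err.code  # the server answered, but with an error status
    except (URLError, OSError):
        return None  # connection refused, timeout, DNS failure, etc.
```

A result of 200 from `http_status("http://server/")` suggests Apache is serving the default page; `None` suggests the service is not running or the firewall port is still closed.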

1. Point it at: http://server/repos (where /repos is according to what you put in the httpd.conf file earlier)

This should allow you to browse (an empty repo).

2. In the repo browser, create a test folder.
3. Do a checkout of this folder to your local machine.
4. Create a new text file in that folder on the hard drive.
5. Do an SVN add on that file.
6. Do an SVN commit on that file.
7. Open a new repo browser and browse into your new folder and ensure that the file exists.
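
The steps above use TortoiseSVN, but roughly the same round trip can be sketched with the svn command-line client, driven here from Python. This assumes svn is on your PATH; the function names, the folder name “test”, the file name “hello.txt”, and the commit messages are all hypothetical choices for illustration.

```python
import subprocess
from pathlib import Path


def smoke_test_commands(repo_url, workdir, folder="test", filename="hello.txt"):
    """Return the svn commands mirroring the manual test steps, in order."""
    folder_url = f"{repo_url.rstrip('/')}/{folder}"
    file_path = str(Path(workdir) / filename)
    return [
        ["svn", "mkdir", folder_url, "-m", "Create test folder"],
        ["svn", "checkout", folder_url, str(workdir)],
        ["svn", "add", file_path],
        ["svn", "commit", file_path, "-m", "Add test file"],
        ["svn", "list", folder_url],  # the new file should be listed
    ]


def run_smoke_test(repo_url, workdir, filename="hello.txt"):
    """Run the commands, creating the text file just before 'svn add'."""
    for argv in smoke_test_commands(repo_url, workdir, filename=filename):
        if argv[1] == "add":
            Path(workdir, filename).write_text("hello\n")
        subprocess.run(argv, check=True)  # raises if svn reports an error
```

For instance, `run_smoke_test("http://server/repos", "wc")` would check the test folder out into a local directory named wc; if the final `svn list` prints the file name, the commit reached the repository.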