Migrating From visualstudio.com TFS to an In-House TFS Server (With Full History)

For the past two years, I have been actively developing a collection of proprietary software products, which I have maintained in one of Microsoft’s free visualstudio.com TFS repositories. For a single developer, this has been a great way to keep a complete revision history of the code base. Recently, I ran into a few limitations that made it time to move the repository to a newly created in-house Team Foundation Server. Thinking to myself “OK, I’ll just migrate the repository real quick”, I set to work. After a few hours of poking around VSO, it became painfully clear that there is no viable way to do this while keeping the full revision history, since I did not have access to the underlying database for the repository. This gave me two options: 1) Check out the latest copy of the repository and commit it to the new repository (keep the latest source, but lose the history), or 2) Write a script that loops through the repository, checking out each and every revision (starting at 1) and committing them one by one to the new repository (keep the source and the history, but it may take DAYS or WEEKS to run).

After thinking on the options, I realized that they both sucked. I then remembered that I had written an article a while back about Migrating from TFS to Git, so I pulled it up, re-read it, and thought “I wonder if I can use this same technique to do a TFS to TFS migration?” The answer was yes! There was one downside, though: I kept all of the revision history, but I lost the Changeset timestamps (all of the migrated history has the same check-in date). Just like when migrating from TFS to Git, I had to use the “git-tf” tool for this process. Here’s how I did it:

1. Installing Git-TF

Download the Git-TF utility from its CodePlex page and extract it somewhere on your computer. Don’t forget to install the Java Runtime Environment (JRE) if you don’t already have it; the tool will not run without it.

2. Cloning the TFS repository (with full history)

The next step was a bit trickier. The tool needs the latest copy of the TFS repository that is going to be migrated. However, to clone the repository, I needed to configure Alternate Credentials on my visualstudio.com account. It took a while to find a recent enough article on how to do this. After some trial and error, it turned out to be easier than expected:

  • Log into your visualstudio.com account
  • In the top-right corner, click on your name and select Security
  • In the new window, click on “Alternate authentication credentials” on the left
  • Make sure the “Enable alternate authentication credentials” checkbox is checked
  • Enter a secondary username and password to use, and click save

Once this was done, it was just one command from a Command Prompt to clone my TFS repo:

git-tf.cmd clone https://jarrenlong.visualstudio.com/DefaultCollection $/RepoIAmMoving --deep

Note: If you didn’t follow the Git-TF instructions and add the Git-TF root directory to your system path, just use the full path to the git-tf.cmd file when executing the command. Since I only planned on using this tool once, this is what I did.

This did take a while to clone, as it pulls the entire TFS repo history with it. Just let it cook until it’s done. The repository that I was migrating had just over 4900 Changesets, so it ended up taking about 24 hours to do the complete clone. While this is running, it is a good time to go ahead and create an empty TFS repository on your new server, if you haven’t already done so. For this example, we’ll say that my new Team Foundation Server is accessible at https://tfs.mynewserver.com/DefaultCollection, and the repo I created is called “RepoIAmHosting”.

3. Performing the TFS to TFS Migration

Before you commit the repository to the new server, you need to make a few minor changes:

  • Using Windows Explorer, open the .git directory that was created inside of the cloned repo. There should be a file in there named “git-tf”; rename it to something else. This file tracks all of the Changesets for the repo, but is bound to the old server. If you try to commit the repo to the new server without renaming it, you will most likely get a “Changeset XXX not found” error.
  • Use a text editor to modify the “config” file in the .git directory. This file tells git where the server for the repository is located. In here, you need to modify the [git-tf "server"] section to point to the new server/repository.

For this example, I would change

[git-tf "server"]
collection = https://jarrenlong.visualstudio.com/DefaultCollection
serverpath = $/RepoIAmMoving

To

[git-tf "server"]
collection = https://tfs.mynewserver.com/DefaultCollection
serverpath = $/RepoIAmHosting

Save and close the config file. You are now ready for the actual commit! From the root directory of the repository you cloned, you just need to issue a “git-tf.cmd checkin --deep” command, which will start committing the complete repository to the new server. Again, this is going to take a while, but when the check-in is finished, you will have your complete repository history visible in the new TFS portal. Note: If you need to retain commit usernames, use the --keep-authors flag with this command (see the git-tf documentation for info on how this works). In my scenario, I was the only developer on the project, so there was no need for me to do this.
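If you would rather not make the two manual edits by hand, they could be scripted. Here is a minimal Python sketch of the idea. It is not part of Git-TF; the function names and the “git-tf.old” backup name are placeholders I made up for illustration, and it assumes the config file looks like the example above.

```python
from pathlib import Path


def retarget_git_tf(config_text, collection, serverpath):
    """Return config_text with the [git-tf "server"] section repointed."""
    new_values = {"collection": collection, "serverpath": serverpath}
    in_server_section = False
    result = []
    for line in config_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("["):
            # Track whether we are inside the git-tf server section.
            in_server_section = stripped == '[git-tf "server"]'
        elif in_server_section and "=" in stripped:
            key = stripped.split("=", 1)[0].strip()
            if key in new_values:
                # Keep the original indentation, swap only the value.
                indent = line[: len(line) - len(line.lstrip())]
                line = f"{indent}{key} = {new_values[key]}"
        result.append(line)
    return "\n".join(result) + "\n"


def prepare_for_new_server(repo_dir, collection, serverpath):
    """Rename .git/git-tf and repoint .git/config (hypothetical helper)."""
    git_dir = Path(repo_dir) / ".git"
    tracking = git_dir / "git-tf"
    if tracking.exists():
        # "git-tf.old" is an arbitrary backup name, not a Git-TF convention.
        tracking.rename(git_dir / "git-tf.old")
    config = git_dir / "config"
    text = config.read_bytes().decode("utf-8")
    updated = retarget_git_tf(text, collection, serverpath)
    config.write_bytes(updated.encode("utf-8"))


# Example call, using the placeholder server from this article:
# prepare_for_new_server(
#     r"C:\path\to\cloned\repo",
#     "https://tfs.mynewserver.com/DefaultCollection",
#     "$/RepoIAmHosting",
# )
```

Only the two keys in the [git-tf "server"] section are touched; everything else in the config file is passed through unchanged. Back up the .git directory before trying something like this on a clone that took 24 hours to make.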

As I said at the beginning of this article, there is only one downside to this process: each and every Changeset will have the same timestamp (+/- a few minutes). Sadly, this appears to be unavoidable (at least, I have not found a way to preserve the original timestamps on the Changesets themselves). There is one way that you can (partially) retain the timestamps, though. By using the --metadata flag with the checkin command, git-tf will attach the additional metadata for each commit from the old repository. This preserves the timestamps; however, it makes the display of each commit look a little funky in the Changeset list for the repo when viewed from the web portal. Instead of showing the Changeset # and the description attached to the commit, the web portal will just show “Commit xyz (Timestamp)”, and the description of the commit will be embedded further inside of the Changeset’s details.

Like this article? Make a Donation to Feed the Developer!

Teaching the Girlfriend to Code: Part 1, The Absolute Basics

Hello love! This one’s for you. I’m gonna start writing a collection of lessons on how to program, just for you (cause I love you, and you wanted to learn anyways <3).

I guess at some point in time, I have to start at the very beginning. The very, very beginning. I have an article on the history of computing floating around somewhere, so I’ll leave that part out, and skip to “what is programming for?”

What is programming for?

In short, developing software, AKA computer programming, is a way of commanding a computing device (desktop/laptop, mobile phone, tablet, heavy machinery, etc.) to perform a series of repeatable tasks. Simple enough, right? That being said, computers are relatively stupid; they cannot act for themselves, and can only do exactly what you tell them to do, meaning if they do something wrong, it is because they were programmed to do it the wrong way.

Computers are instructed, or programmed, using programming languages. A programming language is exactly that, a language. This language can be translated into something called machine language, which can then be understood by the computer and used to command it. As of right now, there are literally thousands of programming languages that can be used for programming. Some of these languages are mathematical and look very cryptic (example: C and C++), while others are very similar to English, and can be read as such (example: Visual Basic). If you want to see what some of these languages look like, check out the 99 Bottles of Beer website. They have the song 99 bottles of beer written in 1500 different programming languages!

Types of programming languages

With there being literally thousands of programming languages that are used for different purposes, it would be helpful to start breaking them all down into categories. The big three categories are:

  • Markup languages – Are used to decorate a document (like a research paper) to do things like highlight text, indent paragraphs, etc. Originally, this was the main purpose of markup languages. HTML (short for HyperText Markup Language) is the best-known markup language, and these days markup languages are pretty much just used to make websites.
  • Scripting languages – Are used to write programs that run inside of, and control, other programs. These “scripts” are read by a host program, which then translates them into tasks to perform. The main difference between a scripting language and a regular programming language is that a script can’t run directly on a computer without the help of another program.
  • Programming languages – Are used to instruct the computer itself to perform tasks.

The programming language category can then be split into three more subcategories:

  • High-Level programming languages – A programming language is considered “high level” if the programmer can use the language without worrying about the details of the computer itself. If you can write your program without having to care about things like “what manufacturer of computer is this program going to work for?” then you are using a high-level language.
  • Low-Level programming languages – A “low level” programming language is a language where the programmer needs to know pretty specific information about the computer they are working on. Low level languages are usually pretty hardware-specific, meaning that your program may only run on certain computers, and not work at all on others.
  • Assembly Language and Machine Code – Assembly language (usually just called assembly or “asm”) and machine code are essentially the same thing. Assembly is a human-readable version of machine code, and machine code is the one and only language computers actually understand. Assembly itself is considered a low-level language; it only works on the hardware that it is designed for.

Let’s take a look at an example of some of these languages. Let’s say we have a variable x, and we want to make it equal to -1. In a high-level programming language, this could look like:

x = -1

If you were to look at a low-level language like assembly, you would probably see something like this (at the very guts of it; there would be a lot more work to do to get to this point):

MOV ECX, -1

Cryptic, right? Guess what the computer itself sees? Something along the lines of:

00000030B9FFFFFFFF

in hexadecimal. Or, much more accurately (in binary):

000000000000000000000000001100001011100111111111111111111111111111111111

What the hell is up with all of these weird numbers, you ask?

Numbering Systems

This might be a bit of a tangent, but it is good to know. Humans (yourself, and possibly me too) use a Base-10 “Decimal” numbering system for everything, meaning that our digits go from 0 to 9. Computers, on the other hand, use a Base-2 “Binary” numbering system, meaning the only digits that exist are 0 and 1.

Why? Because a computer (a digital one at least), can only ever be turned on, or turned off. There is no in between. That being said, you’ve seen this before (it’s literally tattooed on my arm):

This is the international symbol for the power button, and it is made by taking the number 1 and laying it over the top of the number 0, to represent “on” and “off”. Yay for symbols!

Back on track. Computers only understand binary, and humans are lazy. As an example, to represent the letter ‘A’ (capital A), the decimal number 65 is used. To a computer, 65 is really 01000001 in binary. But who wants to decode that? When talking about bits and bytes (FYI: a “bit” is one binary digit, and can be 0 or 1, and a byte is a group of 8 bits, which can represent any decimal number from 0 to 255), programmers use the hexadecimal numbering system, which allows us to write the same number in a shorter way. For this example, the decimal 65 (binary 01000001) can be written in hexadecimal as “41”. Don’t worry about figuring out how to translate between the different number systems yet; that’s an advanced topic for another time. There is a handy website here that will show you these conversions, if you’re interested.
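If you’re curious what these conversions look like in actual code, here is a tiny sketch in Python (my pick for illustration, just because it reads close to English). It takes the letter ‘A’ from the example above and shows the same number in decimal, binary, and hexadecimal. The variable names are made up for this example.

```python
letter = "A"
code = ord(letter)            # the decimal number behind the letter
binary = format(code, "08b")  # the same number as 8 binary digits (one byte)
hexa = format(code, "02X")    # the same number as 2 hexadecimal digits

print(code)    # 65
print(binary)  # 01000001
print(hexa)    # 41

# Going the other way: int() reads a string of digits in a given base.
print(int("01000001", 2))  # 65
print(int("41", 16))       # 65
print(chr(65))             # A
```

All three printed values are the same number, just written down in different numbering systems.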

So why do computers only understand binary? That’s simple. A computer is just a machine that runs off electricity. Electricity (in its simplest form) can be turned on or turned off, just like a light switch. A computer itself is just a collection of millions of light switches wired together. Since these switches can be on or off, we can represent them with two numbers, 0 for off, and 1 for on. When you write code for a computer, all you are doing is telling the computer which switches to turn on and off, and when to do it.

Summary

  • Programming is the art of telling a computer what to do
  • There are thousands of programming languages
  • The main categories of languages are programming, scripting, and markup languages, which all have their own uses
  • Programming languages can be high-level, low-level, or machine code
  • Humans have 10 fingers, computers are light switches


Migrating from TFS to Github

After a good run of using visualstudio.com’s free Team Foundation Server for storing one of my projects, I decided today that it was time to get all of my projects back in one place. As much as I like TFS, most of my active projects were already on GitHub anyways, so I decided to go ahead and migrate this project back to my GitHub account. But what to do about my commit history? The last time I migrated a project between Version Control Systems, I lost my entire change history! Not good. A quick search on Google found me exactly what I needed, a nifty little tool called Git-TF, which boasts the ability to migrate repositories from TFS to Git without losing any of the repository’s change log.

1. Installing Git-TF

Step 1 in getting the tool to work was to download and install it. This was pretty easy: the CodePlex page for Git-TF has a handy download link to the latest version (which is a zip file). After a quick download & extract, I tried to run the tool, to no avail. Turns out Java is required. Download & install the latest JRE, and retry… got a help message, so it runs.

2. Cloning the TFS repository (with full history)

The next step was a bit trickier. The tool needs the latest copy of the TFS repository that is going to be migrated. However, to clone the repository, I needed to configure Alternate Credentials on my visualstudio.com account. It took a while to find a recent enough article on how to do this. After some trial and error, it turned out to be easier than expected:

  1. Log into your visualstudio.com account
  2. In the top-right corner, click on your name and select My Security
  3. In the new window, click on “Alternate authentication credentials” on the left
  4. Make sure the “Enable alternate authentication credentials” checkbox is checked
  5. Enter a secondary username and password to use, and click save

Once this was done, it was just one command from a Command Prompt to clone my TFS repo:

git-tf.cmd clone https://jarrenlong.visualstudio.com/DefaultCollection $/Sphere --deep

Note: If you didn’t follow the Git-TF instructions and add the Git-TF root directory to your system path, just use the full path to the git-tf.cmd file when executing the command. Since I only planned on using this tool once, this is what I did.

This did take a while to clone, as it pulls the entire TFS repo history with it. Just let it cook until it’s done. Note that this is the only time that you actually need to use the Git-TF tool. The last few commands just use your regular git client.

3. Prepping the TFS repository for migration

Before you finish the migration, you need to clean up a few things:

  • Delete any .vssscc files in your repo; these are TFS source control bindings (you won’t need them anymore)
  • Use a text editor to modify all of your .sln files, and delete the “GlobalSection(TeamFoundationVersionControl) … EndGlobalSection” blocks
  • Optional: Add .gitignore and .gitattributes files to the root directory of your repository to tell Git what to ignore, and how to handle files in your Git repository
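If your repo has a lot of solutions, the first two cleanup steps can be scripted instead of done by hand. Here is a minimal Python sketch, assuming the bindings block in each .sln starts with a “GlobalSection(TeamFoundationVersionControl)” line and ends with the next “EndGlobalSection” line. The function names are my own, and this is an illustration rather than a tested migration tool.

```python
import re
from pathlib import Path

# Matches the whole TFS bindings block, from its GlobalSection line through
# the first EndGlobalSection line that follows it. Works on raw bytes so the
# file's encoding and line endings are left alone.
TFS_BLOCK = re.compile(
    rb"^[ \t]*GlobalSection\(TeamFoundationVersionControl\).*?"
    rb"^[ \t]*EndGlobalSection[ \t]*\r?\n",
    re.DOTALL | re.MULTILINE,
)


def strip_tfs_bindings(sln_bytes):
    """Return the .sln contents with the TFS bindings block removed."""
    return TFS_BLOCK.sub(b"", sln_bytes)


def clean_repo(root):
    """Delete .vssscc files and strip TFS bindings from every .sln under root."""
    root_path = Path(root)
    for binding in list(root_path.rglob("*.vssscc")):
        binding.unlink()
    for sln in list(root_path.rglob("*.sln")):
        original = sln.read_bytes()
        cleaned = strip_tfs_bindings(original)
        if cleaned != original:
            sln.write_bytes(cleaned)


# Example call (the path is a placeholder):
# clean_repo(r"C:\path\to\cloned\repo")
```

Other GlobalSection blocks in the solution file are left untouched. Review the result with “git status” and “git diff” before committing.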

4. Create your new GitHub repository

You will need a new GitHub repository to migrate your TFS code into. Log into your GitHub.com account and create a new repository. If you’re reading this article, you probably know how to do this.

5. Commit the TFS repository to GitHub

The last step in the TFS to Git migration is to stage and commit the cleanup changes you just made, bind your local repository to the new GitHub repository, and then push everything. Execute these commands from the root of your repository.

git add .
git commit -a -m "Migrated from TFS repo"
git remote add origin https://github.com/JarrenLong/Sphere.git
git push origin master

Once this finishes (again, it may take a while), log into your github.com account and view the repository. Your project and all of its files should now be there, along with their complete history from TFS!


My First WordPress Plugin: Quick DynDNS

I found a PHP code snippet I wrote four or five years ago that allowed me to log IP addresses of website visitors, which I had originally built for providing Dynamic DNS services for an old employer. The code itself was brutish and ugly, but still worked beautifully after all these years. Since I’ve been working on building up https://www.booksnbytes.net for the past few days, I thought it would be fun to try to make a WordPress plugin out of it. Three hours later, the Quick DynDNS plugin lives! And so far, it does exactly what it did years ago. Check out the bottom of https://www.booksnbytes.net to see what your current IP address is!

 

Fair warning: I am NOT logging any personal data, nor do I intend to. Yet.