What exactly are Git's sets of modifications?

6

I'm just starting to use Git, and one thing I heard in the course I'm taking was this: what Git actually saves is not the different versions of the files, but sets of modifications.

In other words, when I make a commit, instead of saving the files in the state they are in, Git (as I understood it) saves in the repository a set of modifications describing the changes that were made.

My question

Relative to which state of the project are the modifications recorded: the initial state or the state of the last commit?

What I mean is basically this: suppose that right after creating the repository we add a file arquivo1.txt. When we commit, the modification that Git records is the creation of that file. After that, we create a new file arquivo2.txt, add a line to arquivo1.txt, and commit.

In this case, what does Git save in the second commit? Is the recorded modification the addition of arquivo2.txt plus the new line (that is, relative to the last commit), or is it the addition of both arquivo1.txt and arquivo2.txt along with the change to the first file (that is, the total modification since the initial state)?

It seems to me that what Git saves is the modification relative to the last commit, since saving the modification relative to the initial state would be equivalent to saving the different versions of the files themselves.

I still don't really understand what these sets of modifications that Git records on every commit are. What are they really?

    
asked by anonymous 28.01.2015 / 21:29

2 answers

3

These terms are not always used correctly.

Git uses snapshots. A snapshot is the state of the content at a given moment, which is why it is also known as a point in time. These snapshots cover the entire repository. Hence a limitation of Git: any commit affects the entire repository, and some mechanism (stash, for example) has to be used to keep part of the content being worked on from being committed to the repository.

So that all of the content does not have to be stored again in each commit, a differential mechanism is used and only a delta encoding of each change is stored. These deltas are computed between the previous version and the current version. Even so, repositories exist on their own. This design allows multiple lines of development to proceed concurrently.

Other version control software may use changesets, or sets of modifications. These are the differences between what already existed in the repository and what is being committed now. This subtle difference makes it harder to work with several concurrent sources of change, but it makes partial commits easier: you can readily select what you want to commit and build a set of modifications containing only what you want at that moment. Repositories are assembled from these changes.

Strictly speaking, in Git we cannot create a set of modifications. Informally, the term ends up being used anyway.
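
To make the distinction concrete, here is a minimal sketch of how a "set of modifications" can be derived on demand by comparing two snapshots. It is only an illustration: the `changeset` function and the dictionaries standing in for commits are hypothetical names, and the file names come from the question.

```python
def changeset(parent, child):
    """Compare two snapshots (path -> content) and list what differs."""
    added = sorted(p for p in child if p not in parent)
    removed = sorted(p for p in parent if p not in child)
    modified = sorted(
        p for p in child if p in parent and parent[p] != child[p]
    )
    return {"added": added, "removed": removed, "modified": modified}


# Each commit is modeled as a full snapshot, not as a list of changes.
empty = {}
commit1 = {"arquivo1.txt": "line 1\n"}
commit2 = {"arquivo1.txt": "line 1\nline 2\n", "arquivo2.txt": "new file\n"}

# Relative to the previous commit: one file added, one modified.
since_last = changeset(commit1, commit2)

# Relative to the initial (empty) state: both files count as added.
since_start = changeset(empty, commit2)
```

Under this model the question "relative to which state?" has no fixed answer: the same snapshot can be compared against any other snapshot, and the set of modifications is whatever that comparison produces.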

I do not really know the inner mechanics of these tools, and it is not easy to find definitions that do not create confusion. Apparently I am not the only one who did not feel prevented from talking about something he does not know in every detail :P. It is easy to find contradictory information on the subject, but according to the best-known documentation, this is how it works.

    
29.01.2015 / 06:42
0

What are sets of modifications in Git?

"Set of modifications" is a translation of changeset, a concept that is not related to how Git stores changes.

> Changeset is a fundamental concept of Git and is also present in many other source code control systems.

  

The basic idea of a changeset is to commit a set of changes atomically, that is, either all the changes in the set are committed successfully or none are. We can make an analogy with a database transaction, which either guarantees the persistence of multiple records at once or rolls back if persisting one of the records fails.
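
The all-or-nothing behavior can be sketched with a toy in-memory repository. This is an assumption-laden illustration, not how Git is implemented: `ToyRepo` and its rule that every file content must be text are invented for the example.

```python
class ToyRepo:
    """Toy model of atomic commits: a commit applies fully or not at all."""

    def __init__(self):
        self.files = {}    # current state: path -> text
        self.history = []  # one full state per successful commit

    def commit(self, changes):
        # Work on a copy so that a failure leaves self.files untouched.
        candidate = dict(self.files)
        for path, text in changes.items():
            if not isinstance(text, str):
                raise TypeError(f"content of {path!r} must be text")
            candidate[path] = text
        # Reached only if every change was accepted: publish in one step.
        self.files = candidate
        self.history.append(candidate)
        return len(self.history)


repo = ToyRepo()
repo.commit({"arquivo1.txt": "line 1\n"})

try:
    # The second entry is invalid, so the first must not be kept either.
    repo.commit({"arquivo2.txt": "new file\n", "arquivo1.txt": None})
except TypeError:
    pass  # the repository still holds only the first commit
```

After the failed call, `repo.files` still contains only arquivo1.txt with its original content, which is the rollback behavior described in the database analogy.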

It is true that changesets in Git go beyond changesets in other version control systems. In Git you can, for example, alter a changeset that is already in the repository! That is, in Git you can modify the history of the changes that have taken place. Of course, this applies only in certain scenarios and there are restrictions, but that is another story.

How does Git save changes?

At first glance, it saves in much the same way as many other version control systems: when a changeset arrives, only the files in the changeset are saved. Files that were not changed stay as they were. At each commit, Git records a snapshot, which is the state of the repository as it was after that commit.

As with other version control systems, you can request the state of the repository at any point in the past; that is, you request a given snapshot. What Git then gives you are the files committed when the snapshot was recorded plus all the files that were already there before, namely those not modified by the commit that produced the snapshot.
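
One way to picture this is a store that keeps each distinct file content once, with every snapshot being just a table that points at those contents. The sketch below is a simplified model under that assumption; `objects`, `store`, and `snapshot` are hypothetical names, and the real on-disk format is more involved.

```python
import hashlib

objects = {}  # content-addressed store: hash -> file content


def store(content: bytes) -> str:
    """Save content under its hash; identical content is stored only once."""
    key = hashlib.sha1(content).hexdigest()
    objects[key] = content
    return key


def snapshot(files: dict) -> dict:
    """Record a snapshot: a mapping from each path to the hash of its content."""
    return {path: store(content) for path, content in files.items()}


first = snapshot({"arquivo1.txt": b"line 1\n"})
second = snapshot({
    "arquivo1.txt": b"line 1\nline 2\n",
    "arquivo2.txt": b"new file\n",
})
third = snapshot({
    "arquivo1.txt": b"line 1\nline 2\n",     # untouched since `second`
    "arquivo2.txt": b"new file, edited\n",   # the only file that changed
})

# The unchanged file is not stored again; both snapshots point at one object.
shared = third["arquivo1.txt"] == second["arquivo1.txt"]
```

Three snapshots list five file entries in total, yet only four distinct contents end up in `objects`, because the untouched file is shared between the last two snapshots.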

No, Git does not save only the changes to each file at commit time. When you commit, Git saves the entire contents of each changed file, not just the modifications made to it.
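
A small sketch shows why a one-line edit produces a whole new stored object. It computes the identifier Git assigns to a file's contents, assuming a repository that uses the default SHA-1 object format; the function name `git_blob_id` is made up for the example.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Hash the full content the way Git names a blob (SHA-1 object format)."""
    # The hashed data is a short header followed by the entire file content.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


before = git_blob_id(b"line 1\n")
after = git_blob_id(b"line 1\nline 2\n")  # same file with one line added

# The identifier of an empty file is a well-known constant:
empty_id = git_blob_id(b"")  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

Because the identifier depends on the complete content, `before` and `after` differ, and the edited file is stored as a new object rather than as a patch on top of the old one.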

Is it correct to say that Git is able to save only the differences between commits of the same file, instead of having to keep the entire file even when only one line has been modified?

Yes, it is correct. In due course, Git performs a kind of garbage collection and, among other things, replaces some historical files with only the changes that occurred in those files between one commit and another (delta encoding). You can also force this process whenever you want.

It is important to note that during garbage collection Git does not replace the files of the newest commits with their delta encoding, but rather the reverse: it records the changes going backward from the most recent state of the file, so that the latest version of the file (which is probably the one you will want most often) can be delivered faster.
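
The backward direction can be illustrated with a rough sketch that keeps only the newest version whole and describes older versions as copy/insert instructions against a newer one. This uses Python's difflib on lists of lines purely for illustration; it is not Git's actual delta algorithm or storage format, and `make_delta` and `apply_delta` are hypothetical helpers.

```python
import difflib


def make_delta(base, target):
    """Describe `target` as copy/insert instructions against `base`."""
    delta = []
    matcher = difflib.SequenceMatcher(a=base, b=target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            delta.append(("copy", i1, i2))           # reuse lines from base
        elif tag in ("replace", "insert"):
            delta.append(("insert", target[j1:j2]))  # lines only in target
        # "delete": lines that exist only in base are simply not copied
    return delta


def apply_delta(base, delta):
    """Rebuild the target version from `base` and its delta."""
    out = []
    for op in delta:
        if op[0] == "copy":
            out.extend(base[op[1]:op[2]])
        else:
            out.extend(op[1])
    return out


newest = ["line 1\n", "line 2\n", "line 3\n"]
older = ["line 1\n", "line 2\n"]
oldest = ["line 1\n"]

# Only `newest` is kept whole; older versions chain backward from it.
delta_older = make_delta(newest, older)
delta_oldest = make_delta(older, oldest)

rebuilt_older = apply_delta(newest, delta_older)
rebuilt_oldest = apply_delta(rebuilt_older, delta_oldest)
```

Reading the newest version costs nothing extra, while each step further into the past requires applying one more delta, which matches the trade-off described above.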

Conclusion

Set of modifications, or changeset, is a concept that deals with commit atomicity and is not directly related to the way Git saves files. Git is one of many version control systems that use this concept.

During the commit, Git saves the entire contents of each changed file, regardless of how slightly the file was modified (just one new line, for example).

Git does not need to save a copy of the repository with every commit to ensure the repository's availability in some past state. Instead, during the commit it registers a snapshot that points to the newly committed files and also to the current versions of the other files that were already there.

At appropriate times, Git reorganizes its object store in order to save space (garbage collection). During this reorganization, past versions of a file can be replaced by records of only the changes that the file underwent (delta encoding). So, when an older version is requested, Git rebuilds the file from its most recent version, applying the recorded changes backward until it reaches the older version.

    
19.02.2016 / 16:33