Subversion considered obsolete
Posted Apr 3, 2010 15:31 UTC (Sat) by smurf (subscriber, #17840)
Parent article: A proposed Subversion vision and roadmap
WRT the 'corporate user': Subversion is unable to separate the steps "fix bug A" and "integrate with bugfixes B, C, D and E, which were checked in while you've been working on A". Mr. Corporate Manager wants this feature.
Subversion cannot do that unless you create a branch for everything. *Poof*, your workflow is now an order of magnitude more complicated than git or hg (which is IMHO even less complicated than SVN, once you let go of the centralized-repo, poisoned-by-CVS mindset).
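For illustration, the separation smurf is asking for is the everyday topic-branch workflow in git (branch and remote names are invented):

    # fix bug A in isolation
    git checkout -b fix-bug-a origin/master
    # ...edit, test...
    git commit -a -m "Fix bug A"
    # meanwhile B, C, D and E landed upstream; integrating them is a separate, explicit step
    git fetch origin
    git rebase origin/master    # or: git merge origin/master
    # publish the finished fix on its own
    git push origin fix-bug-a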
Posted Apr 3, 2010 18:51 UTC (Sat) by ballombe (subscriber, #9523) (22 responses)
Posted Apr 3, 2010 19:20 UTC (Sat) by cortana (subscriber, #24596)
Git clones (which include the entire history of the repository) take up less space than Subversion checkouts, which include no history at all.
Posted Apr 3, 2010 19:28 UTC (Sat) by wahern (subscriber, #37304) (20 responses)
This isn't feasible with Git. I once let a git-svn clone run--over a LAN--for 2 days straight before giving up. Not just Git's fault, granted, but a problem all the same.
The Git solution is to use sub-modules. Whether this is better or worse is fairly debatable, but it's not practical to shift huge SVN trees over to Git.
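For reference, the kind of one-shot import wahern describes looks roughly like this (URL invented); a full-history conversion replays every single revision, which is what takes days on a big tree:

    git svn clone --stdlayout https://svn.example.org/bigproject bigproject.git
    # limiting the imported revision range is one way to keep the first run manageable
    git svn clone -r 20000:HEAD --stdlayout https://svn.example.org/bigproject bigproject.git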
Posted Apr 3, 2010 21:36 UTC (Sat) by maks (guest, #32426)
git svn is nice when you are stuck with svn for whatever reason: you can commit to it and still have almost all the power of git.
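A rough sketch of the day-to-day round trip maks means (repository URL invented):

    git svn clone --stdlayout https://svn.example.org/project
    cd project
    git checkout -b feature    # local branches, stashes and rebases all work as usual
    # ...commit locally as often as you like...
    git svn rebase             # fetch new svn revisions and replay local work on top
    git svn dcommit            # push each local commit back as an svn revision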
Posted Apr 4, 2010 0:55 UTC (Sun) by jengelh (guest, #33263)
I don't see git at fault; how SVN is organized/utilized is the bottleneck. If you had to make one HTTP request for every changed file in every revision (the repo reaching approximately 2 million objects this year), you might not be done downloading linux-2.6.git in two days either. You can easily test that: unpack the packs into separate objects, create the HTTP metadata and let it clone. Then again, you could probably pull it off, given that git does not have to calculate any diffs.
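The experiment jengelh suggests can be approximated with standard git commands (paths are illustrative):

    git clone --bare git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
    cd linux-2.6.git
    # explode the packs into loose objects; the packs (and their .idx files) have to be
    # moved aside first, because git skips objects it thinks it already has
    mv objects/pack/*.pack /tmp/ && rm -f objects/pack/*.idx
    for p in /tmp/pack-*.pack; do git unpack-objects < "$p"; done
    git update-server-info    # writes the "dumb" HTTP metadata (info/refs, objects/info/packs)
    # serve the directory with any plain web server; a dumb-protocol clone then fetches
    # one loose object per HTTP request, much like svn's one-request-per-file behaviour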
Posted Apr 4, 2010 3:38 UTC (Sun) by nbd (subscriber, #14393)
Then you can use that to speed up clones for working on. I have a small script for cloning a git tree then adding the necessary information to sync against the svn server it was cloned from: http://nbd.name/gsc
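This is not nbd's actual script; the recipe it presumably automates looks roughly like the following (URLs invented): clone the already-converted git tree once, then let git-svn rebuild its metadata from the git-svn-id lines in the commits instead of re-importing everything.

    git clone git://server.example.org/project.git
    cd project
    git svn init https://svn.example.org/project/trunk
    # point git-svn's remote ref at the imported history...
    git update-ref refs/remotes/git-svn refs/remotes/origin/master
    # ...and let it rebuild its revision map from the existing commits
    git svn rebase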
Posted Apr 4, 2010 6:59 UTC (Sun) by smurf (subscriber, #17840)
Importing into git can be _fast_ and is limited (on the git side) by the write speed of your disk and your CPU's zlib and SHA-1 computation.
On the other hand, last time I checked, git-svn essentially does an HTTP request for every revision it pulls. That's hardly fast under the best of circumstances. Sorry to say, I have zero interest in finding out whether this could be sped up.
Posted Apr 4, 2010 9:34 UTC (Sun) by epa (subscriber, #39769) (14 responses)
Posted Apr 4, 2010 11:16 UTC (Sun) by peschmae (guest, #32292) (13 responses)
Secondly, the space used by git for storing the entire history is, in most cases, less than the space used by a working copy: for my Linux kernel clone (with complete history going back to 2.6.12) it's currently 400 MB for the .git directory as opposed to 450 MB for the actual checkout. Not really much of an issue in practice.
Posted Apr 4, 2010 17:25 UTC (Sun) by RCL (guest, #63264) (11 responses)
In game development, code is a tiny fraction of the overall repo, and ... git can't even handle that (see my thread of comments below).
SHA calculation is what kills it. As I wrote, I didn't succeed in creating a 22GB (modest by gamedev standards) repo with git/bzr/hg...
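The hashing cost RCL points at is easy to measure on any machine (sizes and names are arbitrary):

    mkdir /tmp/hashtest && cd /tmp/hashtest && git init -q
    dd if=/dev/zero of=bigasset.bin bs=1M count=2048    # 2 GB stand-in for one game asset
    time git hash-object -w bigasset.bin                # SHA-1 plus zlib over the whole blob
    time openssl speed sha1                             # raw digest throughput for comparison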
Versioning really-big files
Posted Apr 4, 2010 20:33 UTC (Sun) by smurf (subscriber, #17840) (4 responses)
The larger problem, however, is that you want a way to carry multiple versions of slowly-changing multi-GB files in your repo -- without paying the storage price of (a compressed version of) the whole blob, each time you check in a single-byte change. Same for network traffic when sending that change to a remote repository.
This is essentially a solved problem (rsync does it all the time) and just needs integration into the VCS-of-the-day. This problem is quite orthogonal to the question of whether said VCS-of-the-day is distributed or central, or whether it is named git or hg or bzr or whatever.
Yes, I know that the SVN people seem to have gotten this one mostly-right ("mostly" because their copy of the original file is not compressed). Hopefully, somebody will do the work for git or hg or whatever. It's not exactly rocket science.
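A crude way to see the rsync behaviour smurf refers to (host and paths invented): after a one-byte change, only a small delta crosses the wire even though the file is a gigabyte.

    dd if=/dev/urandom of=blob.bin bs=1M count=1024               # 1 GB "asset"
    rsync -av blob.bin backup.example.org:/srv/assets/            # first copy: full transfer
    printf x | dd of=blob.bin bs=1 seek=123456 conv=notrunc       # flip one byte
    rsync -av --stats blob.bin backup.example.org:/srv/assets/    # delta transfer this time
    # --stats shows the literal data actually sent, a tiny fraction of the file;
    # that is the behaviour a VCS would need to offer for big blobs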
Versioning really-big (binary) files
Posted Apr 6, 2010 18:47 UTC (Tue) by vonbrand (subscriber, #4458) (3 responses)
git uses delta compression by default (and has done so for a long time now), so the "huge binary files that change a bit" shouldn't be a problem. Please check with the latest version.
Versioning really-big (binary) files
Posted Apr 6, 2010 23:12 UTC (Tue) by dlang (guest, #313) (2 responses)
Even for images and audio: if you were to check them in uncompressed, git's delta functionality would work well and diff the files against each other. But if you compress a file (JPEG, MP3, or even PNG) before checking it in, a small change to the uncompressed data results in a huge change to the compressed data. With lossless compression (i.e. PNG) it would be possible to have git uncompress the file before checking for differences, but with lossy compression you can't do this.
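dlang's point is easy to reproduce with a toy repository (file names arbitrary): the second raw file stores as a tiny delta against the first, while the gzipped variants barely delta at all.

    dd if=/dev/urandom of=frame.raw bs=1M count=8              # stand-in for raw image data
    cp frame.raw frame2.raw
    printf x | dd of=frame2.raw bs=1 seek=4096 conv=notrunc    # a tiny edit
    gzip -c frame.raw  > frame.raw.gz                          # "already compressed" variants
    gzip -c frame2.raw > frame2.raw.gz
    git init -q demo && mv frame* demo/ && cd demo
    git add . && git commit -q -m 'raw and compressed variants'
    git gc --quiet
    git verify-pack -v .git/objects/pack/*.idx | head          # compare the stored sizes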
Versioning really-big (binary) files
Posted Apr 7, 2010 7:53 UTC (Wed) by paulj (subscriber, #341) (1 response)
Do files like that really need to be checked into an SCM? Just archive them somewhere.
Versioning really-big (binary) files
Posted Apr 12, 2010 1:14 UTC (Mon) by vonbrand (subscriber, #4458)
Not really.
If the contents need version control, they should be handled by a VCS. The size or format of the files could be a technical hurdle, sure; but it shouldn't be an excuse for not solving the problem.
Posted Apr 4, 2010 20:55 UTC (Sun) by simlo (guest, #10866) (4 responses)
Where I work, we use subversion (and I use git svn :-). We have scripts which pull the tar.gz files for various packages in specific versions from a server, unpack, patch and cross-compile them for our target. The only things we have in subversion are the scripts and the files we have changed.
For the Linux kernel we tried to have the full thing in subversion, but it was way too much for subversion, so now we only have a makefile which clones a git repository when the source is needed.
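Roughly what such a makefile rule boils down to (the URL and tag are just examples):

    # fetch the kernel source only when it is actually needed
    [ -d linux-2.6 ] || git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
    cd linux-2.6 && git checkout v2.6.33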
Posted Apr 5, 2010 3:12 UTC (Mon) by RCL (guest, #63264) (1 response)
It's almost like having very large firmware blobs in Linux kernels, much larger than they are currently...
See comments below where I elaborate on this interdependency between code and data in games.
Posted Apr 5, 2010 14:00 UTC (Mon) by simlo (guest, #10866)
On the other hand you can have data like maps and icons which is also "source code" and belongs in the VCS, if you edit it using some program (a map editor, GIMP or whatever). But the overall system is badly designed if these files are "large". They ought to be separated into small files, each containing a separate part of the information, and then "compiled" into larger files. This will usually make for a more flexible and maintainable system (besides making life easier for the VCS). It is the same with C code: you don't make one big file but smaller ones, separated by functionality.
Posted Apr 5, 2010 3:22 UTC (Mon) by martinfick (subscriber, #4455)
Posted Apr 8, 2010 10:08 UTC (Thu) by epa (subscriber, #39769)
Posted Apr 5, 2010 12:12 UTC (Mon) by peschmae (guest, #32292)
Looks like you're stuck with perforce :-p
Posted Apr 5, 2010 17:21 UTC (Mon) by chad.netzer (subscriber, #4257)
But yes, in practice, git stores source code repos very compactly, and by doing branch operations in place (rather than using separate directory copies for each "branch") it uses much less space per client than SVN checkouts for a busy developer. And it's also *much* faster for the same reason.
Posted Apr 4, 2010 17:19 UTC (Sun) by lacostej (guest, #2760)
* git already handles some very big trees. Maybe it isn't large enough for you. Seems to work for many.
* git-svn taking time is SVN's fault, not git's. git svn is a bridge that will check out the revisions using the svn client, one by one. As said later in the comments, keep the git-svn clone on a server: done once, people will clone from that one instead.
* from my experience a full git history takes less space than the latest svn checkout. And you can fully work offline, do branches, etc.
To me, SVN is an outdated technology. Git is harder to learn, though, and people can (& do) make mistakes until they properly grasp the DVCS concepts.
Note: hg is a compelling alternative and some people might want to look at http://hginit.com/ for an introduction.
Posted Apr 5, 2010 15:57 UTC (Mon) by Msemack (guest, #65001) (18 responses)
1. Unmergeable files (binaries) and file locking. This is big. A DVCS can't handle the output of our CAD program.
2. Forcing people to keep local copies of the entire repo falls apart above certain repo sizes.
Posted Apr 5, 2010 17:21 UTC (Mon) by iabervon (subscriber, #722) (8 responses)
Also, a DVCS could, in theory, know where to get all the big uninteresting files instead of actually storing them on the client. That is, for files that aren't useful to compare aside from identity, it would be perfectly reasonable for the DVCS to store on the client "hash xyz is available at (location)", and only actually get the content when needed. For that matter, a DVCS could store the content of large binary files in bittorrent (or a site-internal equivalent) and beat a centralized distribution point.
So far, we haven't seen any DVCSes that do either of these things, but there's no reason they couldn't, aside from the fact that there aren't developers who want to work on those particular problems. That is, version control programs are written in environments with merging, and with files that compress well against each other and against other versions of the same file; this means that "eating your own dogfood" isn't sufficient to motivate developers of version control systems to fix these problems, while centralized systems, by and large, tend to just happen to work for these cases by default or with very little design effort.
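No DVCS did this out of the box at the time, but the "hash xyz is available at (location)" idea maps fairly naturally onto git's clean/smudge filters. A hypothetical sketch (the filter helpers and the blobstore URL are invented, not an existing tool): the clean filter uploads the real content somewhere and emits a one-line pointer such as "bigblob <sha1> http://blobstore.example.org/<sha1>", and the smudge filter fetches that URL again only when a working tree actually needs the file.

    # route big binaries through a (hypothetical) pointer filter
    echo '*.bin filter=bigblob' >> .gitattributes
    git config filter.bigblob.clean  bigblob-clean     # content in, pointer line out (e.g. a curl -T upload)
    git config filter.bigblob.smudge bigblob-smudge    # pointer line in, content out (e.g. a curl fetch)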
Posted Apr 5, 2010 17:46 UTC (Mon) by Msemack (guest, #65001) (1 response)
Posted Apr 5, 2010 18:01 UTC (Mon) by iabervon (subscriber, #722)
Posted Apr 6, 2010 19:07 UTC (Tue) by vonbrand (subscriber, #4458) (5 responses)
Why all this locking nonsense? You have complete control over your clone of the central repo (as a bonus, nobody sees any dumb experiments you try). Use something like gitolite to provide finer-grained access to a shared repo.
And have a real person as a gatekeeper for changes. Call them QA or something.
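A minimal sketch of that gatekeeper setup, assuming pushes arrive over ssh as distinct system users (the "qa" account is invented): an update hook in the shared bare repository, e.g. /srv/git/project.git/hooks/update, that only lets the QA account touch master. gitolite gives the same kind of per-branch control with far more polish.

    #!/bin/sh
    # update hook: $1 = ref name, $2 = old sha1, $3 = new sha1
    # reject updates to master from anyone but the QA gatekeeper
    if [ "$1" = "refs/heads/master" ] && [ "$(id -un)" != "qa" ]; then
        echo "only the QA gatekeeper may update master" >&2
        exit 1
    fi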
Posted Apr 6, 2010 20:01 UTC (Tue) by iabervon (subscriber, #722) (4 responses)
It's only relevant to projects where merge conflicts can't be resolved (if you get a merge conflict, someone's work has to be thrown away and redone from scratch), and where you'd rather have the person whose work would be wasted do something else, or take the afternoon off, instead of wasting their time. But those are important cases in a number of industries that aren't software development.
Posted Apr 7, 2010 1:10 UTC (Wed) by vonbrand (subscriber, #4458) (3 responses)
The git way to handle this is to have everybody work in their own area (no "I step on your toes" possible), and merge the finished products when the developer says they are done. As everybody can also freely take versions from anybody else, this doesn't restrict work based on not-yet-accepted changes in any way (sanctioning a change as official is an administrative decision, as it should be, not one forced by the tool).
Posted Apr 7, 2010 2:13 UTC (Wed) by iabervon (subscriber, #722) (2 responses)
Posted Apr 7, 2010 20:21 UTC (Wed) by vonbrand (subscriber, #4458) (1 response)
OK, but this is a problem that no VCS can solve (because there is no reasonable way to merge separate modifications). Locking doesn't help either; in any case this requires administrative (workflow) coordination between people.
Posted Apr 7, 2010 20:32 UTC (Wed) by foom (subscriber, #14868)
The advisory locking in SVN *is the implementation of* the administrative (workflow) coordination between people.
Posted Apr 6, 2010 19:03 UTC (Tue) by vonbrand (subscriber, #4458) (8 responses)
Locking is not needed in git; all changes are atomic by design. Sure, if several people mess around committing stuff at random into the same repo, chaos could ensue (but "lock for a commit" won't help there anyway). A solution is to have a real person as a gatekeeper for changes. Or use something like gitolite to provide finer-grained access to a shared repo.
"Large files can't be handled"? I don't see where that comes from. Sure, early git versions did handle everything by keeping (compressed) copies of each file's contents they came across, but that is long gone now.
Posted Apr 6, 2010 23:16 UTC (Tue) by dlang (guest, #313) (3 responses)
git's pack file format uses 32-bit offsets, which limits the largest pack size to ~4GB (I believe it uses unsigned 32-bit values; if they're signed, the limit is 2GB). I think it always points to the beginning of a file, so a file larger than 4GB can exist in a pack, but it would be the only thing in the pack.
Size limit in git objects?
Posted Apr 7, 2010 1:02 UTC (Wed) by vonbrand (subscriber, #4458) (1 response)
Wrong. From Documentation/technical/pack-format.txt for current git (version v1.7.0.4-361-g8b5fe8c):
    Observation: length of each object is encoded in a variable length format and is not constrained to 32-bit or anything.
Size limit in git objects?
Posted Apr 7, 2010 3:23 UTC (Wed) by dlang (guest, #313)
So you can have up to 4GB in a pack file, plus however much the last object runs off the end of it.
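For anyone who wants to poke at this: the offsets under discussion are visible with verify-pack, and packs can be told to split well before any such limit (the size below is arbitrary):

    git verify-pack -v .git/objects/pack/pack-*.idx | head   # per-object size and offset within the pack
    git config pack.packSizeLimit 1g                         # cap pack size at the next repack
    git repack -a -d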
Posted Apr 7, 2010 22:56 UTC (Wed) by cmccabe (guest, #60281)
> limits the file size to your memory size.
You can mmap(2) files that are bigger than your memory size.
Of course, if you're on 32-bit, there are some size limitations because of the limited size of virtual memory.
Posted Apr 7, 2010 20:21 UTC (Wed) by tialaramex (subscriber, #21167) (3 responses)
What happens in many proprietary systems, and in subversion if you choose, is that you need a lock to be "authorised" to edit a file, not to commit the change. The procedure looks like:
1. The lockable files start off read-only
2. You _tell the VCS_ that you want to work on file X
2.1 The VCS contacts a central server, asking for a lock on file X
2.2 If that's granted, file X is set read-write
2.3 Optionally, your program for working on file X is started automatically
3. You do some work, maybe over the course of hours or even days
4. You save and check in the new file X, optionally releasing the lock
The VCS can't strictly _enforce_ the locking rule, of course -- you could copy file X, or set it read-write manually, then start working on it, and only try to take the lock a day later. But when you complain to your boss that someone changed file X after you'd started work on it, of course he won't have much sympathy.
The locking model has lots of problems, but some people have convincing arguments for why it's appropriate to their problem. The only options for Git are (1) not appealing to those users, (2) persuading them all that they're wrong, or (3) offering some weird hybrid mode where there can be a central locking server for a subset of files. If you accept that (1) could be the right option, then the continued existence of Subversion is justified for those users already.
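For concreteness, the Subversion side of that procedure is roughly (the file name is invented):

    svn propset svn:needs-lock '*' artwork.psd      # step 1: checkouts get the file read-only
    svn commit -m 'require a lock for artwork.psd'
    svn lock -m 'reworking the logo' artwork.psd    # step 2: ask the central server for the lock
    # ...edit for hours or days...
    svn commit -m 'new logo' artwork.psd            # step 4: committing releases the lock by default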
Posted Apr 8, 2010 20:56 UTC (Thu) by vonbrand (subscriber, #4458) (2 responses)
In centralized systems (especially those that don't handle merges decently) this is really the only way to work, true (RCS works this way, for example). But with decentralized systems with reasonable branching and merging it isn't required. And locking a file doesn't really make sense in a decentralized system (there is no single "file" to lock!), so it is left out of the tool. If the workflow requires some sort of "don't touch these files for a while" synchronization, it has to be handled outside.
I believe this requirement's importance is way overblown. How many projects do you work on where it is really a requirement (and not an artifact of shortcomings in how the tool integrates changes)?
Posted Apr 8, 2010 21:56 UTC (Thu) by foom (subscriber, #14868) (1 response)
Being a distributed VCS doesn't make this workflow management tool any less necessary. And it's convenient to have it integrated with the VCS so that "status" shows the status, and "commit" checks and releases the locks, and so on.
You might want to read this so you can know what you're talking about:
http://svnbook.red-bean.com/en/1.5/svn.advanced.locking.html
I wish I didn't have to keep defending subversion: git is really nice. But come on people, just be honest about what it can't do, and stop claiming those things are unnecessary!
Posted Apr 9, 2010 16:07 UTC (Fri) by vonbrand (subscriber, #4458)
As I said, the workflow might require it. But in a decentralized environment there simply can't be any "common rallying point" for all developers handled by the DVCS, so the tool itself can't help you here.
Not because of a design flaw, but for fundamental reasons: the "locking" idea only makes sense if there is one master copy shared by all.
