digitalmars.D.learn - git workflow for D

bitwise <bitwise.pvt gmail.com> writes:
I've finally started learning git, due to our team expanding 
beyond one person - awesome, right? Anyways, I've got things more 
or less figured out, which is nice, because being clueless about 
git is a big blocker for me trying to do any real work on 
dmd/phobos/druntime. As far as working on a single master branch 
works, I can commit, rebase, merge, squash, push, reset, etc, 
like the best of em. What I'm confused about is how all this 
stuff will interact when working on a forked repo and trying to 
maintain pull requests while everyone else's commits flood in.

How does one keep their fork up to date? For example, if I fork 
dmd, and wait a month, do I just fetch using dmd's master as a 
remote, and then rebase? Will that actually work, or is that 
impossible across separate forks/branches? What if I have 
committed and pushed to my remote fork and still want to merge in 
the latest changes from dlang's master branch?

And how does a pull request actually work? Is it a request to 
merge my entire branch, or just some specific files? and do I 
need a separate branch for each pull request, or is the pull 
request itself somehow isolated from my changes?

Anyways, I'd just be rambling if I kept asking questions. If 
anyone can offer any kind of advice, or an article that explains 
these things concisely and effectively, that would be helpful.

     Thanks
Dec 03 2017
Basile B. <b2.temp gmx.com> writes:
On Sunday, 3 December 2017 at 20:05:47 UTC, bitwise wrote:
 I've finally started learning git, due to our team expanding 
 beyond one person - awesome, right? Anyways, I've got things 
 more or less figured out, which is nice, because being clueless 
 about git is a big blocker for me trying to do any real work on 
 dmd/phobos/druntime. As far as working on a single master 
 branch works, I can commit, rebase, merge, squash, push, reset, 
 etc, like the best of em. What I'm confused about is how all 
 this stuff will interact when working on a forked repo and 
 trying to maintain pull requests while everyone else's commits 
 flood in.

 How does one keep their fork up to date?
Just push to your fork/master after pulling from the (shared) origin/master.
 For example, if I fork dmd, and wait a month, do I just fetch 
 using dmd's master as a remote, and then rebase? Will that 
 actually work, or is that impossible across separate 
 forks/branches? What if I have committed and pushed to my 
 remote fork and still want to merge in the latest changes from 
 dlang's master branch?
In a non-personal project you NEVER commit to master. You make each single fucking change in a specific branch (sorry for the language, it's intentionally gross). The master branch is only updated when you pull from the origin.
 And how does a pull request actually work? Is it a request to 
 merge my entire branch,
Yes it's a merge. Optionally all the commits can be squashed.
 or just some specific files? and do I need a separate branch 
 for each pull request,
Yes, yes yes, again. ~master is sacrosanct.
 or is the pull request itself somehow isolated from my changes?

 Anyways, I'd just be rambling if I kept asking questions. If 
 anyone can offer any kind of advice, or an article that 
 explains these things concisely and effectively, that would be 
 helpful.

     Thanks
The only article I've ever read about git was the first time I needed to squash (actually I'd rather do "fixup" 99% of the time...), so I have nothing else to add.
Dec 03 2017
Steven Schveighoffer <schveiguy yahoo.com> writes:
On 12/3/17 3:48 PM, Basile B. wrote:
 On Sunday, 3 December 2017 at 20:05:47 UTC, bitwise wrote:
 or just some specific files? and do I need a separate branch for each 
 pull request,
Yes, yes yes, again. ~master is sacrosanct.
For good reason. If you commit things to your master, and they don't get merged into the mainline master, now you have a borked master, and you have to reset it.

One other thing to mention: whenever I update my master from the upstream repo, I always, always specify --ff-only, which means git will only update my master if it can fast-forward, i.e. my master is just an earlier version of upstream master with no extra commits of its own. Even though I never *intentionally* commit to master, this is a sanity check to make sure I didn't *accidentally* do it.

If you have to reset master, then I recommend reading articles. I always use this website for any git questions:

https://git-scm.com/book/en/v2

Or just do a search on Google. You will probably find the answers on Stack Overflow.

-Steve
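A minimal sketch of the --ff-only safety net described above. Everything here is a throwaway stand-in: the directories upstream and mine, the empty commits, and the variable ff_result are made up for illustration, and a local directory plays the role of the official repository.

```shell
set -e
work=$(mktemp -d) && cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical stand-in for the official repository, with one commit on master.
git -c init.defaultBranch=master init -q upstream
git -C upstream commit -q --allow-empty -m "initial"

# "Your" clone; the official repo is registered as a remote named upstream.
git clone -q upstream mine
cd mine
git remote rename origin upstream

# Case 1: upstream moved on, local master is untouched -> plain fast-forward.
git -C ../upstream commit -q --allow-empty -m "upstream work"
git pull -q --ff-only upstream master

# Case 2: an accidental commit on master, then upstream moves again.
git commit -q --allow-empty -m "oops, committed to master"
git -C ../upstream commit -q --allow-empty -m "more upstream work"
if git pull -q --ff-only upstream master 2>/dev/null; then
    ff_result=merged
else
    ff_result=refused   # git stops instead of quietly creating a merge commit
fi
echo "second update: $ff_result"
```

In the second case master is left exactly as it was, so the stray commit can be moved to a branch before master is reset.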
Dec 04 2017
Mengu <mengukagan gmail.com> writes:
On Sunday, 3 December 2017 at 20:05:47 UTC, bitwise wrote:
 I've finally started learning git, due to our team expanding 
 beyond one person - awesome, right? Anyways, I've got things 
 more or less figured out, which is nice, because being clueless 
 about git is a big blocker for me trying to do any real work on 
 dmd/phobos/druntime. As far as working on a single master 
 branch works, I can commit, rebase, merge, squash, push, reset, 
 etc, like the best of em. What I'm confused about is how all 
 this stuff will interact when working on a forked repo and 
 trying to maintain pull requests while everyone else's commits 
 flood in.

 How does one keep their fork up to date? For example, if I fork 
 dmd, and wait a month, do I just fetch using dmd's master as a 
 remote, and then rebase? Will that actually work, or is that 
 impossible across separate forks/branches? What if I have 
 committed and pushed to my remote fork and still want to merge 
 in the latest changes from dlang's master branch?
You can fork it, add the main dmd repo as a remote named upstream, and then git fetch upstream. You can then rebase your work on top of upstream/master.
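A sketch of that fetch-and-rebase sequence, assuming local directories stand in for the real repositories: upstream plays the official dmd repo, fork.git plays a GitHub fork, and the branch name my-fix and the files are hypothetical.

```shell
set -e
work=$(mktemp -d) && cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical official repo with one tracked file, plus a bare "fork" of it.
git -c init.defaultBranch=master init -q upstream
echo "v1" > upstream/core.txt
git -C upstream add core.txt
git -C upstream commit -q -m "initial"
git clone -q --bare upstream fork.git

# Clone the fork (it becomes origin) and register the official repo as upstream.
git clone -q fork.git local
cd local
git remote add upstream "$work/upstream"

# Do some work on a topic branch.
git checkout -q -b my-fix
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "my fix"

# A month passes; the official repo gains a commit.
echo "v2" > "$work/upstream/core.txt"
git -C "$work/upstream" commit -q -am "upstream change"

# Fetch the new upstream history, then replay the topic branch on top of it.
git fetch -q upstream
git rebase -q upstream/master

git log --oneline --graph my-fix
```

With a real fork, the only line that changes in spirit is the remote URL: the upstream remote would point at the official repository instead of a local path.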
 And how does a pull request actually work? Is it a request to 
 merge my entire branch, or just some specific files? and do I 
 need a separate branch for each pull request, or is the pull 
 request itself somehow isolated from my changes?
Commits can be cherry-picked, or you can request that your entire branch be merged. The target doesn't always have to be the master branch. For example, if there's a std.experimental.logger branch, you can ask for your branch to be merged into that. Having a separate branch for each feature is the way to go most of the time; it makes things cleaner for yourself. Later on you can delete the merged branches.
 Anyways, I'd just be rambling if I kept asking questions. If 
 anyone can offer any kind of advice, or an article that 
 explains these things concisely and effectively, that would be 
 helpful.

     Thanks
Dec 03 2017
Arun Chandrasekaran <aruncxy gmail.com> writes:
Git CLI is arcane and esoteric. I've lost my commits before 
(yeah, my mistake). Since then I always access git via mercurial. 
In comparison, Mercurial is a far better VCS tool.
Dec 03 2017
Basile B. <b2.temp gmx.com> writes:
On Sunday, 3 December 2017 at 22:22:47 UTC, Arun Chandrasekaran 
wrote:
 Git CLI is arcane and esoteric. I've lost my commits before 
 (yeah, my mistake).
Who hasn't ;) Happened to me last time because i tried a command supposed to remove untracked files in submodules...but used "reset" in a wrong way... ouch.
 Since then I always access git via mercurial. In comparison 
 Mercurial is far better a VCS tool.
I use git gui (I find other GUIs slower, even if nicer), but even so there are only a few commands to know:
- checkout /*select a branch or a commit in the history*/
- commit /*validate changes*/
- pull /*get upstream changes*/
- push /*send changes upstream*/
- rebase /*squash - fixup*/
- stash /*backup before pull, just in case*/
You can live with that.
Dec 03 2017
Arun Chandrasekaran <aruncxy gmail.com> writes:
On Sunday, 3 December 2017 at 23:39:49 UTC, Basile B. wrote:
 On Sunday, 3 December 2017 at 22:22:47 UTC, Arun Chandrasekaran 
 wrote:
 Git CLI is arcane and esoteric. I've lost my commits before 
 (yeah, my mistake).
Who hasn't ;) Happened to me last time because i tried a command supposed to remove untracked files in submodules...but used "reset" in a wrong way... ouch.
If you still lose changes, you could try using Mercurial with hggit. It can be a bit slow, but it's not as destructive as git itself. ;)

I really wish Mercurial had won instead of git. Now that hg evolve and hg topic are stable, that actually alleviates the need for git. But the world talks git now, so everyone else is forced to talk git too :( I guess that without StackOverflow and GitHub, no one would be using git.

Facebook uses Mercurial, and their team is working on a Mercurial server in Rust:

https://github.com/facebookexperimental/mononoke

I thought Facebook used DLang as well. No one's motivated to write one in DLang?
Dec 03 2017
Arun Chandrasekaran <aruncxy gmail.com> writes:
On Monday, 4 December 2017 at 01:26:45 UTC, Arun Chandrasekaran 
wrote:
 On Sunday, 3 December 2017 at 23:39:49 UTC, Basile B. wrote:
 [...]
If you still lose changes, you could try using Mercurial with hggit. It can be a bit slow, but not destructive as git itself. ;) I really wish Mercurial won instead of git. Now that hg evolve and hg topic are stable, that actually alleviates the need for git. But the world talks git now. So everyone else is forced to talk in git :( I guess, without StackOverflow and GitHub, no one would be using git. Facebook uses Mercurial and their team is working on a Mercurial server in Rust. https://github.com/facebookexperimental/mononoke I thought Facebook uses DLang as well. No one's motivated to write one in DLang?
Looks like Mercurial is going to be rewritten in Rust:

https://www.mercurial-scm.org/wiki/OxidationPlan

So Facebook doesn't use D?
Dec 05 2017
Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Wednesday, December 06, 2017 04:56:17 Arun Chandrasekaran via 
Digitalmars-d-learn wrote:
 Looks like Mercurial is going to be rewritten in Rust
 https://www.mercurial-scm.org/wiki/OxidationPlan

 So Facebook don't use D?
As I understand it, the main languages at Facebook are C++ and PHP, but they use a variety of languages depending on who's doing the work and what part of Facebook they're in. Some of them were using D while Andrei was working there. I don't know if they're using it now or not. But even if some of them are still using D, others could be using Rust or whatever took their fancy and their bosses let them use. IIRC, some of the Facebook guys in London were even using Haskell.

- Jonathan M Davis
Dec 05 2017
ketmar <ketmar ketmar.no-ip.org> writes:
Basile B. wrote:

 On Sunday, 3 December 2017 at 22:22:47 UTC, Arun Chandrasekaran wrote:
 Git CLI is arcane and esoteric. I've lost my commits before (yeah, my 
 mistake).
Who hasn't ;)
me.
 Happened to me last time because i tried a command supposed to remove 
 untracked files in submodules...but used "reset" in a wrong way... ouch.
"git reflog". nothing committed is *ever* lost until you do "git gc". git sometimes does GC on its own, so you can turn it off with:

	git config --global gc.auto 0

don't forget to manually GC your repo then with "git gc", or it may grow quite huge.
Dec 03 2017
Patrick Schluter <Patrick.Schluter bbox.fr> writes:
On Monday, 4 December 2017 at 01:54:57 UTC, ketmar wrote:
 Basile B. wrote:

 On Sunday, 3 December 2017 at 22:22:47 UTC, Arun 
 Chandrasekaran wrote:
 Git CLI is arcane and esoteric. I've lost my commits before 
 (yeah, my mistake).
Who hasn't ;)
me.
 Happened to me last time because i tried a command supposed to 
 remove untracked files in submodules...but used "reset" in a 
 wrong way... ouch.
"git reflog". nothing committed is *ever* lost until you do "git gc".

This needs to be repeated: nothing in git is ever lost once it has been committed. You can lose untracked files, but commits do not disappear. If you're unsure before an operation and have difficulty using git reflog, do a simple "git branch life-draft" (or whatever name you want) before the operation. If the operation fails, the commit your HEAD was on is still referenced by the life-draft branch. Branches and tags are just pointers into the directed graph that a git repository is. The interface simply does not display chains of commits that no branch or tag points to.

 git sometimes does GC on its own, so you can turn it off
 with:

 	git config --global gc.auto 0

 don't forget to manually GC your repo then with "git gc", or it 
 may grow quite huge.
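A small sketch of both recovery routes mentioned in this sub-thread: the bookmark branch and the reflog. The repository, the three empty commits, and the variable tip are hypothetical; life-draft is just the branch name used in the post.

```shell
set -e
work=$(mktemp -d) && cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git -c init.defaultBranch=master init -q repo
cd repo
git commit -q --allow-empty -m "one"
git commit -q --allow-empty -m "two"
git commit -q --allow-empty -m "three"
tip=$(git rev-parse HEAD)

# Bookmark the current commit before doing anything risky.
git branch life-draft

# The "risky operation": throw away the last two commits.
git reset -q --hard HEAD~2

# Recovery route 1: the bookmark still points at the old tip.
git reset -q --hard life-draft

# Recovery route 2: with no bookmark, the reflog remembers where HEAD has been.
git reset -q --hard HEAD~2
git reflog -n 3                   # latest HEAD movements, newest first
git reset -q --hard "HEAD@{1}"    # where HEAD was just before the last reset
```

Either way master ends up back on the original commit; the bookmark is simply easier to read than a reflog entry.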
Dec 03 2017
Basile B. <b2.temp gmx.com> writes:
On Monday, 4 December 2017 at 04:45:01 UTC, Patrick Schluter 
wrote:
 On Monday, 4 December 2017 at 01:54:57 UTC, ketmar wrote:
 Basile B. wrote:

 On Sunday, 3 December 2017 at 22:22:47 UTC, Arun 
 Chandrasekaran wrote:
 Git CLI is arcane and esoteric. I've lost my commits before 
 (yeah, my mistake).
Who hasn't ;)
me.
 Happened to me last time because i tried a command supposed 
 to remove untracked files in submodules...but used "reset" in 
 a wrong way... ouch.
"git reflog". nothing commited is *ever* lost until you do "git gc".
Actually (and I reply to ketmar too) I've never lost commits either. What happened to me is that I lost uncommitted stuff in the staging area, because, as you noticed, it would otherwise have been recoverable. Now I use "git submodule foreach git clean -f" to clean submodules.
Dec 04 2017
"Nick Sabalausky (Abscissa)" <SeeWebsiteToContactMe semitwist.com> writes:
On 12/03/2017 03:05 PM, bitwise wrote:
 I've finally started learning git, due to our team expanding beyond one 
 person - awesome, right? 
PROTIP: Version control systems (no matter whether you use git, subversion, or whatever), are VERY helpful on single-person projects, too! Highly recommended! (Or even any time you have a directory tree where you might want to enable undo/redo/magic-time-machine on!)
 Anyways, I've got things more or less figured 
 out, which is nice, because being clueless about git is a big blocker 
 for me trying to do any real work on dmd/phobos/druntime. As far as 
 working on a single master branch works, I can commit, rebase, merge, 
 squash, push, reset, etc, like the best of em.
Congrats! Like Arun mentioned, git's CLI can be a royal mess. I've heard it be compared to driving a car by crawling under the hood and pulling on wires - and I agree. But it's VERY helpful stuff to know, and the closer you get to understanding it inside and out, the better off you are. (And I admit, I still have a long ways to go myself.)
 What I'm confused about 
 is how all this stuff will interact when working on a forked repo and 
 trying to maintain pull requests while everyone else's commits flood in.
Yea. It's fundamental stuff, but it can be frustratingly convoluted for the uninitiated. TBH, I really wish we had widespread tools that cater to what have emerged as the most common best-practice workflows and the basic primitives of those workflows. I even went so far as to get started on such a tool (I call it "wit"), but it's been kind of on the back burner lately, and it's still far too embryonic for any kind of release. (I am still using a git repo for it locally though! Again, highly recommended. At the very least, just because there's nothing worse than accidentally losing a bunch of important code, or finding you need to undo a bunch of changes that didn't work out.)

One thing to keep in mind: Any time you're talking about moving anything from one repo to another, there are exactly two basic primitives: push and pull. Both of them are basically the same simple thing: All they're about is copying the latest new commits (or tags) from WW branch on XX repo, to YY branch on ZZ repo. All other git commands that move anything between repos start out with this basic "push" or "pull" primitive. (Engh, technically "fetch" is even more of a primitive than those, but I find it more helpful to think in terms of "push/pull" for the most typical daily tasks.)
 How does one keep their fork up to date? For example, if I fork dmd, and 
 wait a month, do I just fetch using dmd's master as a remote, and then 
 rebase?
Yes. "pull" from the official ~master to your local repo's ~master, if necessary rebasing your changes on top of the new ~master.

Generally, though, you shouldn't have any changes of your own in your ~master branch, so the rebase part typically isn't needed at all (git calls this ideal situation a "fast-forward"). Unless, of course, you happen to be futzing around and making commits directly to your ~master (which you're not *supposed* to be doing anyway - standard best practice is to do all your work within a branch). In that case you probably will need to rebase. (The alternative to rebasing is always a merge, but in the specific situation you're describing, rebase is definitely cleaner and will lead to fewer problems.)
 Will that actually work,
Yes.
 or is that impossible across separate 
 forks/branches?
Totally possible. In fact, that's exactly the sort of stuff git is designed to handle.
 What if I have committed and pushed to my remote fork 
 and still want to merge in the latest changes from dlang's master branch?
 
You pretty much already got this right. First, you do just as you said above:

1. Pull from the official repo's ~master to your local repo's ~master (rebasing, if necessary, any commits you may have on your local ~master. Although, like Basile said, if you're making local commits to ~master then *technically* "you're doing it wrong").

2. And then, after you've pulled (and maybe rebased) the latest official updates to your local machine...maybe then you added some more commits of your own...then you can push it all from your local machine's clone to your remote fork.

Think of your remote github fork as the "published" copy of your local repo. It should exactly mirror your local repo (minus whatever branches/commits you're not yet ready to unleash upon the world): You do your work locally, and whenever you want to "publish" or "make public" your local work, you push it from your local repo to your remote github fork. That's all your remote github fork is: a copy of whatever parts of your local repo you've chosen to publish.
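Steps 1 and 2 can be sketched roughly like this, assuming the usual remote naming where origin is your fork and upstream is the official repo. Both are simulated with local directories here (upstream and fork.git are made-up names), and --ff-only is an optional safety net suggested elsewhere in this thread, not something the steps require.

```shell
set -e
work=$(mktemp -d) && cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical official repo, a bare "fork" of it, and your local clone of the fork.
git -c init.defaultBranch=master init -q upstream
git -C upstream commit -q --allow-empty -m "initial"
git clone -q --bare upstream fork.git
git clone -q fork.git local
cd local
git remote add upstream "$work/upstream"

# The official repo gains a commit while you were away.
git -C "$work/upstream" commit -q --allow-empty -m "upstream work"

# Step 1: bring local master up to date with the official master.
git checkout -q master
git pull -q --ff-only upstream master

# Step 2: publish the result to your fork.
git push -q origin master
```

After the push, all three copies of master (official, local, fork) point at the same commit.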
 And how does a pull request actually work?
Those two steps I outlined above? A pull request is step 3:

3. After you've done steps 1 and 2 above, go to your remote fork on github.com, go to whatever branch you were making your changes on (you *were* making all your changes in a separate branch and not ~master, right? Because otherwise, doing another PR before your last one is merged becomes a big PITA - that's why people are so adamant about doing all your work in separate branches even though it's honestly a bit of an administrative pain to make a branch every time there's a change you're brewing in your mind.) Then, hit the big button to make a PR out of your branch.

The PR tells the others you want them to pull the commits from your branch on your github remote repo, into their ~master.
 Is it a request to merge my 
 entire branch, or just some specific files?
The entire branch. Or specifically, the commits in your branch. And 99.9% of the time, the way it works is, your PR proposes they merge your "feature-foobar" or "issue 977734" branch into THEIR ~master. And then when they approve and merge your PR - there it goes - your commits get copied from your branch to their repo's ~master.
 and do I need a separate 
 branch for each pull request, or is the pull request itself somehow 
 isolated from my changes?
You *should* create a separate branch for each pull request unless you're a masochist. There's *no* isolation other than whatever isolation YOU create. (Not my idea of award-winning software design, but meh, it is what it is). This is why people are adamant about making a separate branch for each pull request. *Technically* speaking you don't absolutely HAVE to...But if you *don't* create a separate branch for each PR, you're just asking for pain: It'll be a PITA if you want to create another PR before your first one is approved and merged. And it'll be a PITA if your PR is rejected and you want to do any more work on the codebase. Think of your local ~master as being a direct mirror of the official repo. Do your work in separate branches (each to be converted into individual PRs), and keep your ~master as your copy of "this is the current state of the official project".
 If anyone can
 offer any kind of advice, or an article that explains these things
 concisely and effectively, that would be helpful.
You asked about keeping your fork up-to-date. Like I mentioned before, typically your fork should have these branches:

~master: Should be an exact copy of what the official project's ~master branch looked like last time you updated.

issue95423: Your work that's going to be your PR to fix issue 95423. Starts with ~master and then adds additional commits on top.

foobar-feature: Your work that's going to be your PR to add feature foobar. Starts with ~master and then adds additional commits on top.

So, you've got it like that, and your PRs are taking a while to develop. In the meantime, the official project receives a lot of updates, which makes your local clone out-of-date. So what do you do?

1. Pull the latest changes from the official repo's ~master to your local ~master. This should be trivial, and git calls it a "fast-forward". After all, your ~master is identical to the official one, except you're just missing the newest commits.

2. Pull the latest changes from the official repo's ~master to each of your local branches: issue95423 and foobar-feature. For these, you'll need to rebase your changes on top of what you're pulling. There might be conflicts. If so, you'll need to resolve them.

That's it. Then when you're satisfied with your changes:

3. Publish, by pushing your local issue95423 or foobar-feature branch to a same-named branch on your github remote repo (have git create this new branch if it doesn't already exist on your github remote repo).

4. Go onto github.com and create a PR out of the issue95423 or foobar-feature branch of your github remote repo.

5. Optionally, add more commits to your local branches, push them to your github remote repo, and when you do, your PR will automatically be updated with the new commits.
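The numbered steps above might look roughly like this on the command line. This is only a sketch under stated assumptions: local directories stand in for GitHub (upstream for the official repo, fork.git for your fork), and the file foobar.txt is made up. One detail the steps don't spell out is added here and labeled in the comments: once a branch has been pushed and is later rebased, the push has to overwrite the published copy, for which git provides --force-with-lease.

```shell
set -e
work=$(mktemp -d) && cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical official repo, a bare "fork" of it, and your local clone of the fork.
git -c init.defaultBranch=master init -q upstream
git -C upstream commit -q --allow-empty -m "initial"
git clone -q --bare upstream fork.git
git clone -q fork.git local
cd local
git remote add upstream "$work/upstream"

# Work on a feature branch and publish it (a PR would be opened from this branch).
git checkout -q -b foobar-feature
echo "foobar" > foobar.txt
git add foobar.txt
git commit -q -m "add foobar"
git push -q -u origin foobar-feature

# Meanwhile, the official repo moves on.
git -C "$work/upstream" commit -q --allow-empty -m "upstream work"

# Step 1: fast-forward local master to the official master.
git checkout -q master
git pull -q --ff-only upstream master

# Step 2: rebase the feature branch onto the refreshed master.
git checkout -q foobar-feature
git rebase -q master

# Steps 3/5: re-publish. (Addition not in the steps above: the rebase replaced
# the branch's commits with new ones, so the fork's copy must be overwritten.)
git push -q --force-with-lease origin foobar-feature
```

If the branch had never been pushed before the rebase, a plain push would have been enough.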
Dec 04 2017
Patrick Schluter <Patrick.Schluter bbox.fr> writes:
On Monday, 4 December 2017 at 11:51:42 UTC, Nick Sabalausky 
(Abscissa) wrote:
 On 12/03/2017 03:05 PM, bitwise wrote:

 One thing to keep in mind: Any time you're talking about moving 
 anything from one repo to another, there's exactly two basic 
 primitives there: push and pull. Both of them are basically the 
 same simple thing: All they're about is copying the latest new 
 commits (or tags) from WW branch on XX repo, to YY branch on ZZ 
 repo. All other git commands that move anything bewteen repos 
 start out with this basic "push" or "pull" primitive. (Engh, 
 technically "fetch" is even more of a primitive than those, but 
 I find it more helpful to think in terms of "push/pull" for the 
 most typical daily tasks.)
No, the pair is push/fetch. pull is fetch+merge, and a lot of confusion comes from that, in fact. I've seen several people cursing git because of the idea that pull is the opposite of push. When I explained that they should never use git pull, but always separate the fetch from the merge, it clicked every time. So, avoid pull, look first at what fetch does and if that is what you thought it would do, do the merge and be happy.
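A sketch of the fetch-then-look-then-merge habit described above. The repositories official and mine, the empty commits, and the variable incoming are hypothetical stand-ins.

```shell
set -e
work=$(mktemp -d) && cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical shared repo and a clone of it; two new commits land after cloning.
git -c init.defaultBranch=master init -q official
git -C official commit -q --allow-empty -m "initial"
git clone -q official mine
cd mine
git -C ../official commit -q --allow-empty -m "change 1"
git -C ../official commit -q --allow-empty -m "change 2"

# fetch only downloads: origin/master moves, your own master does not.
git fetch -q origin

# Look before merging: which commits are on origin/master but not on master?
incoming=$(git rev-list --count master..origin/master)
echo "$incoming new commit(s) on origin/master:"
git log --oneline master..origin/master

# Once that matches what you expected, integrate it as a separate, explicit step.
git merge -q origin/master
```

The range master..origin/master is the part a plain pull would have merged without showing it to you first.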
Dec 04 2017
Ali Çehreli <acehreli yahoo.com> writes:
On 12/04/2017 09:38 AM, Patrick Schluter wrote:

 So, avoid pull, look first what fetch does and if that is what you
 thought it would do, do the merge and be happy.
+1

Paraphrasing someone I trust very much, "Never 'pull', always 'fetch -p' and then rebase."

Ali
Dec 04 2017
Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Monday, December 04, 2017 12:02:37 Ali Çehreli via Digitalmars-d-learn 
wrote:
 On 12/04/2017 09:38 AM, Patrick Schluter wrote:
  > So, avoid pull, look first what fetch does and if that is what you
  > thought it would do, do the merge and be happy.

 +1

 Paraphrasing someone I trust very much, "Never 'pull', always 'fetch -p'
 and then rebase."
I use pull all the time, but it's always pulling master from upstream, and I never make any changes to my local master; all my changes go on branches. Using github or gitlab, I really don't have any other reason to pull or fetch, because the only place that I normally care about getting code from is upstream master. Any code being merged from someone else is merged into upstream master via github/gitlab, and my code is all done on separate branches that get pushed up to github/gitlab to be merged.

With different workflows (like sharing work directly with someone rather than using github or gitlab), I could see reasons to be wary of pull, but in the typical workflow with github/gitlab, I really don't see any reason to be wary of it - not when the only time it's needed is to sync my local master with the main one on github/gitlab.

- Jonathan M Davis
Dec 04 2017
"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Dec 04, 2017 at 12:02:37PM -0800, Ali Çehreli via Digitalmars-d-learn
wrote:
[...]
 Paraphrasing someone I trust very much, "Never 'pull', always 'fetch
 -p' and then rebase."
I always use `git pull --ff-only`. Lets me pull when it's "safe", and aborts if it would end up in a mess (i.e., tells me when I've made a boo-boo and committed to master). Usually, I only make changes in a local branch, so it's just a matter of rebasing the local branch afterwards.

T

--
Right now I'm having amnesia and deja vu at the same time. I think I've forgotten this before.
Dec 05 2017
"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Dec 04, 2017 at 06:51:42AM -0500, Nick Sabalausky (Abscissa) via
Digitalmars-d-learn wrote:
 On 12/03/2017 03:05 PM, bitwise wrote:
 I've finally started learning git, due to our team expanding beyond
 one person - awesome, right?
PROTIP: Version control systems (no matter whether you use git, subversion, or whatever), are VERY helpful on single-person projects, too! Highly recommended! (Or even any time you have a directory tree where you might want to enable undo/redo/magic-time-machine on!)
+100! (and by '!' I mean 'factorial'. :-P)

I've been using version control for all my personal projects, and I cannot tell you how many times it has saved me from my own stupidity (i.e., having to roll back a whole bunch of changes, or just plain ole consulting an older version of the code that I've forgotten). Esp. with git, it also lets me play with experimental code changes without ever worrying that if things don't work out I might have to revert everything by hand (not fun! and very error-prone).

In fact, I use version control for more than just code: *anything* that's text-based is highly recommended to be put under version control if you're doing any serious amount of editing with it, because it's just such a life-saver. Of course, git works with binaries too, but diffing and such become a lot easier if everything is text-based. This is why I always prefer text-based file formats when it comes to authoring.

Websites are a good example of something that really ought to be under version control. Git, especially, lets you clone the website to a testing server where you can experiment with changes without fear, and once you're happy with the changes, commit and push to the "real" web server. Notice an embarrassing mistake that isn't easy to fix? No problem, just git checkout HEAD^, and that buys you the time you need to fix the problem locally, then re-push.

I've also recently started putting certain subdirectories under /etc in git. Another life-saver when you screw up a configuration accidentally and need to revert to the last-known good config. Also good for troubleshooting to see exactly what changes were made that led to the current state of things.

tl;dr: use version control WHEREVER you can, even for personal 1-man projects, not only for code, but for *everything* that involves a lot of changes over time.
 Anyways, I've got things more or less figured out, which is nice,
 because being clueless about git is a big blocker for me trying to
 do any real work on dmd/phobos/druntime. As far as working on a
 single master branch works, I can commit, rebase, merge, squash,
 push, reset, etc, like the best of em.
Congrats! Like Arun mentioned, git's CLI can be a royal mess. I've heard it be compared to driving a car by crawling under the hood and pulling on wires - and I agree. But it's VERY helpful stuff to know, and the closer you get to understanding it inside and out, the better off you are. (And I admit, I still have a long ways to go myself.)
Here's the thing: in order to use git effectively, you have to forget all the traditional notions of version control. Yes, git does use much of the common VC terminology, and, on the surface, does work in similar ways. BUT. You will never be able to avoid problems and unexpected behaviours unless you forget all the traditional VC notions, and begin to think in terms of GRAPHS. Because that's what git is: a system for managing a graph. To be precise, a directed acyclic graph (DAG).

Roughly speaking, a git repo is just a graph (a DAG) of commits, objects, and refs. Objects are the stuff you're tracking, like files and stuff. Commits are sets of files (objects) that are considered to be part of a changeset. Refs are just pointers to certain nodes in the graph.

A git 'branch' is nothing but a pointer to some node in the DAG. In git, a 'branch' in the traditional sense is not a first-class entity; what git calls a "branch" is nothing but a node pointer. The traditional "branch" is merely a particular configuration of nodes in the DAG that has no special significance to git.

Git maintains a notion of the 'current branch', i.e., which pointer will serve as the location where new nodes will be added to the DAG. By default, this is the 'master' branch (i.e., a pointer named 'master' pointing to some node in the DAG).

When you run `git commit`, what you're doing is creating a new node in the DAG, with its parent pointer set to the node the current branch points to. So if the current branch is 'master', and it's pointing to the node with SHA hash 012345, then `git commit` will create a new node with its parent pointer set to 012345. After this node is added to the graph, the current pointer, 'master', is updated to point to the new node.

By performing a series of `git commit`s, what you end up with is a linear chain of nodes, with the current branch ('master') pointing to the last node. This, we traditionally view as a "branch", but in git, there is nothing special at all about this chain; it's just a (sub)graph of some nodes. The git 'branch' is nothing but a pointer to the last of these nodes. You can easily make this pointer point to something else -- you wouldn't normally do this, but sometimes it can be useful.

You can also decide that instead of adding new nodes to 'master', you want to add new nodes elsewhere in the DAG. No problem, just `git checkout` some arbitrary node, and start running `git commit` on it. The first new commit will take that node as parent, and thereby start creating a new chain of nodes "branching off" the 'master' chain.

Merging a branch in git is likewise not something you'd think of in traditional VC terms; it's basically nothing but creating a new node with two parents, one from the tip of each respective branch. You can 'merge' any two arbitrary nodes together. Though of course, in general you'll end up with a huge number of conflicts if the node contents aren't correlated with each other -- but git doesn't actually mind that; you can actually overwrite all the contents with something else altogether and commit that, and git will happily take that as the "merge" of the two unrelated branches. The resulting graph won't make any sense in terms of revision history in the traditional VC sense, but git doesn't care. The point is that as far as git is concerned, it's all just a DAG. The fact that the contents of two adjacent nodes happen to be similar is just a "coincidence", albeit a usual one.

The more 'arcane' git operations like rebasing, history rewriting, etc., are at the end of the day nothing more than graph operations, updating a bunch of pointers and moving nodes around. If you begin thinking of your repo as a graph and forget traditional VC notions of branches, you'll find that git suddenly starts to "make sense", and you'll be able to do amazing things to your repo without losing your way.

[...]
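To poke at the graph described above with real commands, here is a minimal sketch. The repository, the empty commits A, B and C, the branch name side, and the variables a, b and c are all hypothetical.

```shell
set -e
work=$(mktemp -d) && cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git -c init.defaultBranch=master init -q repo
cd repo
git commit -q --allow-empty -m "A"
a=$(git rev-parse HEAD)
git commit -q --allow-empty -m "B"
b=$(git rev-parse HEAD)

# A commit is a node that records its parent: B's "parent" line names A.
git cat-file -p "$b"

# A branch is only a named pointer: master resolves to commit B.
git rev-parse master

# A new branch is just another pointer, here aimed at the older node A.
git branch side "$a"
git checkout -q side
git commit -q --allow-empty -m "C"   # C's parent is A, so a second chain forks off
c=$(git rev-parse HEAD)

# A merge is a node with two parents, one per branch tip.
git merge -q --no-edit master
git rev-list --parents -n 1 HEAD     # prints the merge id, then C, then B
```

Creating the branch side copied no commits; only the later `git commit` and `git merge` added nodes to the graph.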
 ([...] there's nothing worse than accidentally losing a bunch of
 important code, or finding you need to undo a bunch of changes that
 didn't work out.)
If you think in terms of graphs, you'll hardly ever need to worry about losing changes. Just think in terms of code: if you were given a bunch of pointers to nodes in a graph, and you need to update these pointers, what's the safest way to do it? Easy: just save the pointers to some local variables, then do whatever updates you want, and if it doesn't work out, just overwrite the pointers with the saved values, and you're back to where you started.

In git, because everything is SHA-hashed, nodes are actually immutable. Even the so-called history rewriting, technically speaking, isn't really "rewriting"; it's actually creating a NEW subgraph that just happens to be similar to the older part of the graph plus some changes, and updating your refs (pointers) to point to nodes in the new part of the graph instead. In git, nodes that have nothing pointing to them are considered garbage; `git gc` will delete them from the graph. So once all your pointers are pointing to the new nodes, you've effectively discarded the old nodes; hence the overall effect is "rewriting" the graph. But if you still keep a ref to the old nodes, they will still be there; nothing is lost.

It's like dealing with immutable values in D: you can never change them, but you *can* make (modified) copies of them and change your pointers to point to the copies instead of the original values.

As long as you still keep refs to the old nodes, they will never be lost no matter what you do to your graph. And note that the parent pointers in each node are also part of the SHA hash, so the topology of the old part of the graph is immutable too. There is literally nothing you can do that can change the content or topology of those old nodes. As long as you have a way to reach them, you will still have your old history completely intact.

And how do you create backup copies of your pointers? Easy: remember a git 'branch' is nothing but a pointer? Well, so you just go `git checkout <branch>; git checkout -b backup_ref; git checkout <branch>` and now you have a pointer called 'backup_ref' that points to that same node that <branch> is pointing to. (The final `git checkout <branch>` matters: `checkout -b` also switches to the new branch, so without it your next commits would move backup_ref instead. `git branch backup_ref <branch>` creates the pointer without switching at all.) Now you can do whatever you want to <branch> -- add new commits, overwrite it with a ref to a completely different node, whatever. If at any point you decide that you want it to point to the original node again, just `git checkout <branch>; git reset --hard backup_ref`. As long as you don't touch backup_ref, you will be able to go back to the original state.

(See? This is why you have to stop thinking of a git repo in traditional VC terms. Your git repo is a graph. (With immutable nodes.) That's all there is to it.)
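The save-the-pointer idea can be tried out in a scratch repository. This is only a sketch: the repository name `demo` and the commit messages are invented, and it uses `git branch backup_ref master` to create the spare pointer without switching branches.

```shell
set -e
work=$(mktemp -d) && cd "$work"           # throwaway sandbox (hypothetical location)
git init -q demo && cd demo
git symbolic-ref HEAD refs/heads/master   # name the first branch 'master' whatever the local default is
git config user.name "Demo" && git config user.email "demo@example.invalid"
git commit -q --allow-empty -m "known-good state"

git branch backup_ref master              # save the pointer; we stay on master
saved=$(git rev-parse backup_ref)

git commit -q --allow-empty -m "experiment 1"
git commit -q --allow-empty -m "experiment 2"
moved=$(git rev-parse master)             # master now points two nodes further on

git reset -q --hard backup_ref            # overwrite the pointer with the saved value
restored=$(git rev-parse master)

git --no-pager reflog show -n 3           # the experiment hashes are still listed here
git cat-file -e "$moved" && echo "old experiment node still exists"
```

Note that the reset only moves the pointer: the two experiment nodes remain in the object store until garbage collection eventually removes them.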
 One thing to keep in mind: Any time you're talking about moving
 anything from one repo to another, there's exactly two basic
 primitives there: push and pull. Both of them are basically the same
 simple thing: All they're about is copying the latest new commits (or
 tags) from WW branch on XX repo, to YY branch on ZZ repo. All other
 git commands that move anything between repos start out with this
 basic "push" or "pull" primitive. (Engh, technically "fetch" is even
 more of a primitive than those, but I find it more helpful to think in
 terms of "push/pull" for the most typical daily tasks.)
Again, this will all make so much more sense if you think in terms of graphs.

What `git fetch` does is to download a bunch of nodes from a remote source. Don't even think in terms of branches; think in terms of individual nodes (which imply their own graph connectivity structure -- because the parent pointers are an immutable part of them) that are downloaded from the remote source. After downloading these nodes, git will create a new pointer (i.e., ref) to point to the last node (i.e., the node from which the other nodes can be reached), usually with a name like upstream/somebranch. There is nothing special about this name besides the convention that we use names of the form x/y for pointers named 'y' that we downloaded from 'x'; it's just a pointer to some nodes that you downloaded off the 'net.

What 'git pull' does is to try to reconcile these downloaded nodes with the nodes in your local branch -- and here is where wrinkles can arise, because, by convention, git will try to merge the nodes from x/y into the local branch called y. It's all good if the local branch y points to an ancestor of x/y, i.e., your local branch is just a subgraph of the remote branch, and since the parent pointers of the downloaded nodes already point to y (i.e., they are already a part of the graph! -- because they share an ancestor node), the only thing that's needed is to update y to point to x/y (i.e., the new tip of the branch) instead. This is called 'fast-forwarding'.

But what if your local branch has diverged from the remote branch? I.e., the nodes in local branch 'y' share a common ancestor with the downloaded nodes in x/y, but have different descendant nodes. Now we cannot simply set y to x/y, because that would cause you to lose your pointer to your local nodes, which means `git gc` will garbage-collect them (i.e., your local changes will be lost). So git tries to be 'helpful' here by attempting to merge the nodes together -- i.e., create a new series of nodes that incorporate the changes from *both* y and x/y. Unfortunately, this process often causes further problems, because remember, nodes are immutable, so the only way you can merge the changesets together is by creating new nodes ("merge commits" in git parlance) on top of the old ones. But once you do that, your local branch 'y' is no longer the same as the remote one, so when it comes time to push your changes to other collaborators, or to pull from remote again later, it causes more conflicts in a never-ending spiral.

The best approach is to avoid this situation altogether, by designating certain branches (usually master) as pull-only, i.e., you never commit changes to them, all your changes are committed to local branches. In terms of graphs, you never change the value of the 'master' pointer, but may add new nodes to the graph by using other pointers ("local branches") for that purpose. Then `git pull` will always be fast-forward only (the value of the local 'master' pointer will always be equal to, or an ancestor of, the remote 'master' pointer, so it is always possible to just replace the local 'master' pointer with the remote value without losing any nodes).

This is why I recommend to *always* run:

    git pull --ff-only upstream master

The --ff-only tells git not to try to be smart and create a mess of merge commits, but to only ever fast-forward the master pointer. If this fails, then you know you've made a mistake and updated the master pointer where you should have used a local branch instead. (How to fix this is left as an exercise for the reader: hint, remember 'master' is just a pointer. Just create a new local branch to point to the current nodes, i.e., backup your pointer, then reset 'master' to the last common ancestor with the upstream nodes, then `git pull`, and rebase your local branch afterwards.)
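For anyone who wants to see the failure and the fix from the hint in action, here is one possible sketch using two local repositories. The names `upstream-repo`, `local`, and `mywork`, and the file contents, are assumptions made up for the demo; the remote is a plain directory rather than a real server.

```shell
set -e
work=$(mktemp -d) && cd "$work"           # throwaway sandbox (hypothetical location)

# 'upstream-repo' is a local stand-in for the remote project (hypothetical name).
git init -q upstream-repo
( cd upstream-repo
  git symbolic-ref HEAD refs/heads/master
  git config user.name "Demo" && git config user.email "demo@example.invalid"
  echo one > a.txt && git add a.txt && git commit -q -m "upstream 1" )

git clone -q -o upstream upstream-repo local    # the remote is named 'upstream', as in the post
cd local
git config user.name "Demo" && git config user.email "demo@example.invalid"

# The mistake: a commit made directly on master, while upstream also moves on.
echo mine > mine.txt && git add mine.txt && git commit -q -m "local work on master"
( cd ../upstream-repo && echo two >> a.txt && git commit -q -a -m "upstream 2" )

git fetch -q upstream                     # downloads nodes; only upstream/master moves
if git pull -q --ff-only upstream master; then ff=ok; else ff=refused; fi
echo "fast-forward: $ff"                  # the pull above is expected to be refused

# The fix from the hint: back up the pointer, rewind master, fast-forward, rebase.
git branch mywork master                  # spare pointer to the local nodes
git reset -q --hard "$(git merge-base master upstream/master)"
git pull -q --ff-only upstream master     # now a plain fast-forward
git checkout -q mywork
git rebase -q master                      # replay the local work on top of the new master
```

After this, `master` matches `upstream/master` again and the local commit lives on `mywork`, one node above it.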
 How does one keep their fork up to date? For example, if I fork dmd,
 and wait a month, do I just fetch using dmd's master as a remote,
 and then rebase?
If you keep to the convention of never committing to master locally, then you can just `git pull --ff-only upstream master` and it will pull in the latest changes. Then you just rebase your local branch(es) on top of master.

In graph-centric terms, running `git rebase master` in a local branch B does the following: (1) find the common ancestor A of master and B; (2) for each node in B up to (but not including) A, create a corresponding new node that contains the same changes, but is based on the tip of master instead of A; (3) set B to point to the last of the new nodes.

Special note: since rebase isn't actually modifying nodes -- remember nodes are immutable -- if you're unsure or want to be extra-careful, you can keep a spare reference to the old tip of B before running the rebase, like this:

    git checkout B
    git branch B-backup
    git rebase master

If you then run `git log --graph --all`, you'll see that there are now *two* copies of the commits you made in B: one in the original position branching off master at ancestor A, and the other is now based on master. 'B' will now point to the new nodes, but you'll still be able to access the old nodes via 'B-backup'. If at any time you wish to 'undo' the rebase, just reset B to B-backup. (The new nodes will then become unreferenced, and will be garbage-collected. Unless you kept another pointer to them, of course.)

See? No danger of data loss. (Unless you forget to keep a spare pointer to your old nodes. But even in that case, there's still a way out with `git reflog` -- git gc doesn't actually delete nodes until they're past a certain age, so as long as you notice the problem early and not a week or month later, your old nodes will still be there. You just have to dig through `git reflog` to find the old pointer values, i.e., SHA hashes. Once you find the right SHA hash, just `git checkout <hash>` to go back to the old node, then `git checkout -b <oldbranch>` to create a new branch pointer to point to the old nodes.) [...]
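A small sketch of the rebase-with-a-spare-pointer routine, run in a scratch repository. The file names and commit messages are invented for illustration; only the branch names B and B-backup come from the explanation above.

```shell
set -e
work=$(mktemp -d) && cd "$work"           # throwaway sandbox (hypothetical location)
git init -q demo && cd demo
git symbolic-ref HEAD refs/heads/master   # name the first branch 'master' whatever the local default is
git config user.name "Demo" && git config user.email "demo@example.invalid"

echo base > base.txt && git add base.txt && git commit -q -m "A: common ancestor"
git checkout -q -b B
echo feature > feature.txt && git add feature.txt && git commit -q -m "B: my change"
git checkout -q master
echo more >> base.txt && git commit -q -a -m "master moves on"

git checkout -q B
git branch B-backup                       # spare pointer to the old tip of B
old_tip=$(git rev-parse B)
git rebase -q master                      # creates NEW nodes on top of master, then moves B
new_tip=$(git rev-parse B)
new_parent=$(git rev-parse 'B^')

git --no-pager log --graph --oneline --all   # both copies of "B: my change" are visible

# Undo the rebase: point B back at the old nodes.
git reset -q --hard B-backup
undone_tip=$(git rev-parse B)
```

The rebased copy gets a new hash because its parent changed, while the original node stays reachable through B-backup the whole time.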
 and do I need a separate branch for each pull request, or is the
 pull request itself somehow isolated from my changes?
You *should* create a separate branch for each pull request unless you're a masochist. There's *no* isolation other than whatever isolation YOU create. (Not my idea of award-winning software design, but meh, it is what it is). This is why people are adamant about making a separate branch for each pull request. *Technically* speaking you don't absolutely HAVE to...But if you *don't* create a separate branch for each PR, you're just asking for pain: It'll be a PITA if you want to create another PR before your first one is approved and merged. And it'll be a PITA if your PR is rejected and you want to do any more work on the codebase.
[...] Just think of it as updating a graph. You have a local copy of the graph, and you've added a bunch of new nodes to it. Now you want the upstream people to add your new nodes to their copies of the graph too. Suppose further that these nodes represent several different changesets. What's the best way to manage these nodes?

It should be obvious that the best way is to use a different pointer for each changeset, so that if the upstream people decide to merge changeset A but reject changeset B, you can keep your local copy of the graph straight. If you use the *same* pointer for all changesets, then it should be no surprise when things become a big mess when upstream merges some changesets but not others, yet locally you have no way of addressing each changeset separately.

Even if all your changes eventually get merged, in the interim you may be running git rebase to apply your changes to the latest upstream code; if you only keep a single pointer around for everything, you're going to lose track of what's going on really quickly.

There's no *requirement* that you do things this way, of course, but it's just a matter of being able to keep your own changesets straight when you have to reconcile your local graph with the remote one.

T

-- 
Never wrestle a pig. You both get covered in mud, and the pig likes it.
Dec 05 2017
prev sibling next sibling parent crimaniak <crimaniak gmail.com> writes:
On Sunday, 3 December 2017 at 20:05:47 UTC, bitwise wrote:
 How does one keep their fork up to date? For example, if I fork
https://help.github.com/articles/syncing-a-fork/
Dec 04 2017
prev sibling next sibling parent jmh530 <john.michael.hall gmail.com> writes:
On Sunday, 3 December 2017 at 20:05:47 UTC, bitwise wrote:
 I've finally started learning git, due to our team expanding 
 beyond one person - awesome, right? Anyways, I've got things 
 more or less figured out, which is nice, because being clueless 
 about git is a big blocker for me trying to do any real work on 
 dmd/phobos/druntime. [snip]
Here's my usual workflow.

1) Fork project
2) Add upstream
3) Create a new branch
4) Make changes
5) Add/Commit to branch
6) Push

I sometimes find myself getting tripped up if I need to deviate from this. Ideally, I could just make the change, push it, and it gets accepted. Sometimes though you have to make changes to what you've done and add more commits and then the master has additional updates and you may need to handle merge conflicts. I make fewer mistakes now than when I started, but I'm still nowhere near as good with it as I should be.
Dec 04 2017
prev sibling next sibling parent reply Ali Çehreli <acehreli yahoo.com> writes:
On 12/03/2017 12:05 PM, bitwise wrote:
 I've finally started learning git
git is one of those things where as soon as you understand how it works, you lose the ability to teach. :)

I'm watching this thread with amusement because like most online tutorials, nobody is mentioning the relationship of *three* repos in the picture:

- The original repo, which will be updated by others frequently
- Your clone of it on GitHub which will be hopelessly behind unless you update it (if I'm not mistaken, none of the replies mentioned '-force')
- Your local (e.g. laptop) clone of your GitHub clone, where you do all the work

Dear git experts, given 3 repos, now what are the steps? Is the following correct? What are the exact commands?

- Only once, create the original repo as an upstream of your local repo.

- For each change:

1) Fetch from upstream
2) Rebase origin/master (on upstream, right?)
3) Make changes
4) Commit (potentially after 'add')
5) Repeat steps 3 and 4 as needed
6) 'git push -force' so that your GitHub repo is up-to-date right? (There, I mentioned "force". :) )
7) Go to GitHub and press the big button to create a pull request

Since I still don't know how git works, :) I trust my steps above but I know the steps can use improvement.

Ali
Dec 04 2017
next sibling parent Ali Çehreli <acehreli yahoo.com> writes:
On 12/04/2017 12:14 PM, Ali Çehreli wrote:

 2) Rebase origin/master (on upstream, right?)
2.5) Create a branch and do all work on that branch
 3) Make changes
Ali
Dec 04 2017
prev sibling next sibling parent Eugene Wissner <belka caraus.de> writes:
On Monday, 4 December 2017 at 20:14:15 UTC, Ali Çehreli wrote:
 6) 'git push -force' so that your GitHub repo is up-to-date 
 right? (There, I mentioned "force". :) )
The right option name is --force-with-lease ).
Dec 04 2017
prev sibling parent Steven Schveighoffer <schveiguy yahoo.com> writes:
On 12/4/17 3:14 PM, Ali Çehreli wrote:

 Dear git experts, given 3 repos, now what are the steps? Is the 
 following correct? What are the exact commands?
Disclaimer: I'm not a git expert.
 - Only once, create the original repo as an upstream of your local repo.
The wording is off (s/repo/remote), but yes. This one I always have to look up, because I don't remember the order of the URL vs. the name, and git doesn't care if you swap them. But the command is this:

    git remote add upstream <url>

Where url is the https version of dlang's repository (IMPORTANT: for dlang members, do NOT use the ssh version, as then you can accidentally push to it without any confirmation).
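A quick way to check the argument order is to try it against a scratch repository. In this sketch the bare repository `project.git` is a hypothetical local stand-in for the real https URL, which is not reproduced here.

```shell
set -e
work=$(mktemp -d) && cd "$work"           # throwaway sandbox (hypothetical location)
git init -q --bare project.git            # hypothetical stand-in for the original repository
git init -q local && cd local

url="$work/project.git"                   # in real use: the https URL of the original repo
git remote add upstream "$url"            # order: remote name first, then URL
git remote -v                             # lists the fetch and push URLs for 'upstream'
added_url=$(git remote get-url upstream)
```

Running `git remote -v` afterwards is a cheap sanity check that the name and the URL ended up where you meant them to.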
 
 - For each change:
 
 1) Fetch from upstream
    git fetch upstream

will fetch EVERYTHING, all branches. But it only stores them for reference (as upstream/<branch>), so you can use it without affecting your local branches.
 
 2) Rebase origin/master (on upstream, right?)
No, don't do this every time! If you never commit to your local master (which you shouldn't), you do it this way:

    git checkout master
    git merge --ff-only upstream/master

Optionally you can push your local master to origin, but it's not strictly necessary:

    git push origin master
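Since the question was about all three repos, here is a sketch that wires them up locally and runs the update. The directory names are assumptions: `original` plays the upstream project, `fork.git` plays the GitHub fork, and `local` is the laptop clone.

```shell
set -e
work=$(mktemp -d) && cd "$work"           # throwaway sandbox (hypothetical location)

# 'original' stands in for the upstream project, 'fork.git' for your GitHub fork of it.
git init -q original
( cd original
  git symbolic-ref HEAD refs/heads/master
  git config user.name "Demo" && git config user.email "demo@example.invalid"
  echo v1 > file.txt && git add file.txt && git commit -q -m "upstream v1" )
git clone -q --bare original fork.git     # roughly what the "Fork" button does
git clone -q fork.git local               # your laptop clone; 'origin' is the fork
git -C local remote add upstream "$work/original"

# Time passes: other people's commits land upstream.
( cd original && echo v2 >> file.txt && git commit -q -a -m "upstream v2" )

cd local
git fetch -q upstream                     # only upstream/* refs move
git checkout -q master
git merge -q --ff-only upstream/master    # works because local master has no commits of its own
git push -q origin master                 # optional: bring the fork's master up to date too
```

At the end all three repositories have the same `master` tip, and no merge commit was created anywhere.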
 
 3) Make changes
Step 2.1: checkout a local branch. You can even do this after you have changed some files but *before* you have committed them (IMO one of the best features of git as compared to, say, subversion). The -b option creates the branch and then checks it out:

    git checkout -b mylocalfix
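To see that uncommitted edits really do travel with you onto the new branch, something like the following can be run in a scratch repository (the file name and contents are made up for the demo):

```shell
set -e
work=$(mktemp -d) && cd "$work"           # throwaway sandbox (hypothetical location)
git init -q demo && cd demo
git symbolic-ref HEAD refs/heads/master   # name the first branch 'master' whatever the local default is
git config user.name "Demo" && git config user.email "demo@example.invalid"
echo original > file.txt && git add file.txt && git commit -q -m "base"
master_before=$(git rev-parse master)

echo "my fix" >> file.txt                 # edit while still on master, nothing committed yet
git checkout -q -b mylocalfix             # create the branch and switch; the edit comes along
git status --short                        # still shows file.txt as modified
git commit -q -a -m "fix on its own branch"

current=$(git rev-parse --abbrev-ref HEAD)
master_after=$(git rev-parse master)
```

The commit lands on `mylocalfix` only; the `master` pointer is exactly where it was before the edit.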
 
 4) Commit (potentially after 'add')
Protip: git commit -a automatically stages any files that are already tracked in the repo and have been modified. It does not pick up new files that have never been added.
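A short sketch of what `git commit -a` does and does not include, using two made-up files in a scratch repository:

```shell
set -e
work=$(mktemp -d) && cd "$work"           # throwaway sandbox (hypothetical location)
git init -q demo && cd demo
git symbolic-ref HEAD refs/heads/master   # name the first branch 'master' whatever the local default is
git config user.name "Demo" && git config user.email "demo@example.invalid"
echo one > tracked.txt && git add tracked.txt && git commit -q -m "add tracked.txt"

echo two >> tracked.txt                   # modify a file git already tracks
echo new > untracked.txt                  # brand-new file, never added

git commit -q -a -m "commit -a picks up tracked changes only"
committed=$(git diff-tree --no-commit-id --name-only -r HEAD)   # files in the new commit
leftover=$(git status --short)                                  # what is still uncommitted
echo "committed: $committed"
echo "leftover:  $leftover"
```

The new file stays behind as untracked, so it still needs an explicit `git add` before it can be committed.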
 
 5) Repeat steps 3 and 4 as needed
Correct!
 
 6) 'git push -force' so that your GitHub repo is up-to-date right? 
 (There, I mentioned "force". :) )
I'd say:

    git push origin mylocalfix

This pushes *only* your mylocalfix branch to *your* fork of the repo. No need to force as long as you do not want to squash. Squashing is when you merge multiple commits into one commit so that the history looks cleaner.

I'd recommend never using force (btw, it's --force with 2 dashes) unless it complains. And then, make sure you aren't doing something foolish before using the --force option! Because you are committing only to your fork, and to your branch, even if you mess up here, it's pretty easy to recover.

A couple of examples of when you might squash:

1. You commit, but missed a space between "if" and "(". Instead of generating a commit and log for the typo, you just squash the new commit into the first one.
2. You commit a work in progress, but then change the design. The first commit is useless in the history, as it probably doesn't even apply anymore, so you squash them together, as if you only ever committed the correct version.

To squash the last few commits, I recommend using rebase -i:

    git rebase -i HEAD~3

This will pop up an editor. Follow the instructions listed there! If you want to squash 2 commits together, use "fixup", or even just "f". Note: do NOT "fixup" your first commit, as this will try to squash into someone else's commit that happened before you changed anything!

Once you write the file and exit, git will rebase using your directions. At this point you need to use --force to push (as long as you have already pushed before), as your commit history now differs from github's.
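The squash-then-force-push cycle can be rehearsed without an editor. Note that this sketch swaps in a non-interactive variant: it records the typo fix with `git commit --fixup` and lets `git rebase -i --autosquash` build the todo list, with `GIT_SEQUENCE_EDITOR=true` accepting it unchanged. That is not the manual editing described above, just the same result scripted. The bare repository `fork.git` is a hypothetical stand-in for the GitHub fork, and the final push uses `--force-with-lease`, the safer option mentioned earlier in the thread.

```shell
set -e
work=$(mktemp -d) && cd "$work"           # throwaway sandbox (hypothetical location)
git init -q --bare fork.git               # hypothetical stand-in for your GitHub fork
git init -q local && cd local
git symbolic-ref HEAD refs/heads/master   # name the first branch 'master' whatever the local default is
git config user.name "Demo" && git config user.email "demo@example.invalid"
git remote add origin "$work/fork.git"
echo base > file.txt && git add file.txt && git commit -q -m "base"
git push -q origin master

git checkout -q -b mylocalfix
echo "if(x)" >> file.txt && git commit -q -a -m "Add check"
git push -q origin mylocalfix             # first push: no force needed

printf 'base\nif (x)\n' > file.txt        # fix the missing space
git commit -q -a --fixup=HEAD             # creates a "fixup! Add check" commit
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash master   # squash without opening an editor

# History was rewritten, so a plain push is refused...
if git push -q origin mylocalfix 2>/dev/null; then plain=accepted; else plain=rejected; fi
# ...and a forced one is needed; --force-with-lease refuses if the remote moved unexpectedly.
git push -q --force-with-lease origin mylocalfix
count=$(git rev-list --count master..mylocalfix)
echo "plain push: $plain, commits on branch: $count"
```

After the rebase the branch carries a single "Add check" commit, and the fork's copy of `mylocalfix` has been replaced with it.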
 
 7) Go to GitHub and press the big button to create a pull request
Correct! After you do this, you can continue to run steps 3, 4, 6 to update your PR.

One further step that I like to do to keep my repo clean:

8) When your PR is pulled:

    git fetch upstream
    git checkout master
    git merge --ff-only upstream/master
    git branch -d mylocalfix

This pulls the new changes that were successfully merged into dlang's master into your master. Then it deletes the mylocalfix branch (no longer needed). The lower case -d means to only delete if the changes have been merged (git will complain if they aren't in the history). This is a nice way to clean your local branches up, and verify there isn't anything amiss.

Note also, if you want to work on several fixes at once, you can checkout more than one local branch, and switch between them. Just remember to commit before you checkout the different branches (git will complain if you have uncommitted files, but not files that haven't ever been added).

Hope this all helps!

-Steve
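Step 8 can be rehearsed locally as well. In this sketch the directory `original` is a hypothetical stand-in for the upstream repository, and the "PR merged" event is simulated by having it fast-forward to the fix branch; a real merge on GitHub may instead create a merge, squash, or rebased commit, in which case `git branch -d` may complain.

```shell
set -e
work=$(mktemp -d) && cd "$work"           # throwaway sandbox (hypothetical location)
git init -q original                      # hypothetical stand-in for the upstream repo
( cd original
  git symbolic-ref HEAD refs/heads/master
  git config user.name "Demo" && git config user.email "demo@example.invalid"
  echo v1 > file.txt && git add file.txt && git commit -q -m "upstream v1" )
git clone -q -o upstream original local
cd local
git config user.name "Demo" && git config user.email "demo@example.invalid"
git checkout -q -b mylocalfix
echo fix >> file.txt && git commit -q -a -m "my fix"

# Simulate the maintainers merging the PR: upstream takes the branch as a fast-forward.
( cd ../original && git pull -q --ff-only "$work/local" mylocalfix )

git fetch -q upstream
git checkout -q master
git merge -q --ff-only upstream/master    # local master now contains "my fix"
git branch -d mylocalfix                  # lower-case -d: refuses unless the commits are merged
remaining=$(git branch --list mylocalfix)
```

If the delete is refused, that is a useful hint that something from the branch never made it into master.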
Dec 05 2017
prev sibling next sibling parent Jesse Phillips <Jesse.K.Phillips+D gmail.com> writes:
I'm going to answer with something that others may not agree 
with, maybe they can enlighten me, but let me first get a generic 
principle of git and answer some questions.

Git has 2 types of branches, local branches (you know them as 
just branches) and remotes (which have their own local branches). 
I say this to remove the confusion with having an original 
repository, a forked repository, and a cloned repository. When it 
comes to interactions with these repositories the only difference 
from your local branches is that you can't interact directly with 
them (i.e. you can't commit to them) and the interactions require 
specifying the remote location of the branch.

Some of your other questions are about GitHub and Forking. Git 
doesn't know what a fork is, but GitHub ties a pull request to a 
branch. This means you can update/change your pull request by 
updating/changing your branch. From that you should realize each 
pull request needs its own branch.

Back to git remotes. I'm sure you're aware of the commonly named 
"origin" remote and possibly the second common one, "upstream." 
When you're dealing with many remotes your local branches can get 
complicated. For example, many people use 'master' as a 
*tracking* branch for "origin" (or for "upstream" if there is 
one). I'm to the point that in this situation my recommendation 
is to just delete your 'master' branch, both locally and on 
"origin." Don't worry, you can bring it back if you need it; you 
won't need it.

Here is the thing: you already have two, maybe three, master 
branches: 'master', 'origin/master', and 'upstream/master'. The 
last two are local branches too (they are special in that you 
can't commit to them). And they are also your true tracking 
branches: whenever you fetch/pull from your remote these branches 
are updated, they will always reflect the branch of the remote, 
and they will never conflict during updates. You can always 
create your own master:

    $ git branch master upstream/master

I want to note that 'origin/master' is different from such 
commands as `$ git push origin master`, because in the first you 
are referring to a local branch and in the second you reference 
your remote followed by your remote's local branch (actually I 
could be wrong here, because the full syntax is `$ git push 
origin master:master`, where left: is your local branch and 
:right is the remote's local branch [and note that `push origin 
:master` would delete the remote master branch, because you're 
pushing no branch to it]).
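The local:remote push syntax is easy to poke at in a sandbox. In the sketch below the bare repository `fork.git` is a hypothetical stand-in for "origin", and the branch names `topic` and `mymaster` are invented so that the demo does not touch the remote's master.

```shell
set -e
work=$(mktemp -d) && cd "$work"           # throwaway sandbox (hypothetical location)
git init -q --bare fork.git               # hypothetical stand-in for the remote named 'origin'
git init -q local && cd local
git symbolic-ref HEAD refs/heads/master   # name the first branch 'master' whatever the local default is
git config user.name "Demo" && git config user.email "demo@example.invalid"
git remote add origin "$work/fork.git"
echo hi > file.txt && git add file.txt && git commit -q -m "base"

git push -q origin master:master          # <local branch>:<branch on the remote>
git push -q origin master:topic           # the two names need not match
git rev-parse origin/master               # local, read-only copy of the remote's pointer
after_push=$(git -C "$work/fork.git" for-each-ref --format='%(refname:short)' refs/heads | tr '\n' ' ')

git push -q origin :topic                 # empty left side: delete 'topic' on the remote
after_delete=$(git -C "$work/fork.git" for-each-ref --format='%(refname:short)' refs/heads | tr '\n' ' ')

git branch mymaster origin/master         # new local branch started from a remote-tracking one
echo "after push: $after_push / after delete: $after_delete"
```

So `origin/master` is a ref stored in the local repository, while `origin master` on the push command line names a remote plus a branch inside that remote.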

I hope that this, along with the answers others have given, will 
help you to answer all of your questions.
Dec 05 2017
prev sibling parent John Gabriele <jgabriele fastmail.fm> writes:
On Sunday, 3 December 2017 at 20:05:47 UTC, bitwise wrote:
 {snip} If anyone can offer any kind of advice, or an article 
 that explains these things concisely and effectively, that 
 would be helpful.
I found some git-specific info in this wiki page: <https://wiki.dlang.org/Starting_as_a_Contributor>
Dec 05 2017