How can I extract records with a column in common?
I have two tables with different numbers of columns and rows. I want to find rows using a common column (column B is in common). Here is an example. Could you please help?
file1.txt
A B C D
a b c d
i ii iii iV
* ** # ##
file2.txt
E B
f ff
h b
g gg
k ii
output:
A B C D E
a b c d h
i ii iii iV k
command-line text-processing
add a comment |
I have two tables with different numbers of columns and rows. I want to find rows using a common column (column B is in common). Here is an example. Could you please help?
file1.txt
A B C D
a b c d
i ii iii iV
* ** # ##
file2.txt
E B
f ff
h b
g gg
k ii
output:
A B C D E
a b c d h
i ii iii iV k
command-line text-processing
2
Could you explain what you are doing in a bit more detail please? Why is the last line offile1
not in the output? And why is column B unchanged in the output while column E is added?
– terdon♦
Dec 17 at 11:59
@terdon: I have two tables including some parameters of many galaxies. Assume table one include "mass" , "name","color" and "distance" of galaxies and table 2 include "velocity" and "name". . some galaxies are in common so have same name (column B). Now, I want to have all parameters for this galaxies in one table.
– Negar
Dec 17 at 12:46
add a comment |
I have two tables with different numbers of columns and rows. I want to find rows using a common column (column B is in common). Here is an example. Could you please help?
file1.txt
A B C D
a b c d
i ii iii iV
* ** # ##
file2.txt
E B
f ff
h b
g gg
k ii
output:
A B C D E
a b c d h
i ii iii iV k
command-line text-processing
I have two tables with different numbers of columns and rows. I want to find rows using a common column (column B is in common). Here is an example. Could you please help?
file1.txt
A B C D
a b c d
i ii iii iV
* ** # ##
file2.txt
E B
f ff
h b
g gg
k ii
output:
A B C D E
a b c d h
i ii iii iV k
command-line text-processing
command-line text-processing
edited Dec 17 at 11:36
asked Dec 17 at 11:31
Negar
464
464
2
Could you explain what you are doing in a bit more detail please? Why is the last line offile1
not in the output? And why is column B unchanged in the output while column E is added?
– terdon♦
Dec 17 at 11:59
@terdon: I have two tables including some parameters of many galaxies. Assume table one include "mass" , "name","color" and "distance" of galaxies and table 2 include "velocity" and "name". . some galaxies are in common so have same name (column B). Now, I want to have all parameters for this galaxies in one table.
– Negar
Dec 17 at 12:46
add a comment |
2
Could you explain what you are doing in a bit more detail please? Why is the last line offile1
not in the output? And why is column B unchanged in the output while column E is added?
– terdon♦
Dec 17 at 11:59
@terdon: I have two tables including some parameters of many galaxies. Assume table one include "mass" , "name","color" and "distance" of galaxies and table 2 include "velocity" and "name". . some galaxies are in common so have same name (column B). Now, I want to have all parameters for this galaxies in one table.
– Negar
Dec 17 at 12:46
2
2
Could you explain what you are doing in a bit more detail please? Why is the last line of
file1
not in the output? And why is column B unchanged in the output while column E is added?– terdon♦
Dec 17 at 11:59
Could you explain what you are doing in a bit more detail please? Why is the last line of
file1
not in the output? And why is column B unchanged in the output while column E is added?– terdon♦
Dec 17 at 11:59
@terdon: I have two tables including some parameters of many galaxies. Assume table one include "mass" , "name","color" and "distance" of galaxies and table 2 include "velocity" and "name". . some galaxies are in common so have same name (column B). Now, I want to have all parameters for this galaxies in one table.
– Negar
Dec 17 at 12:46
@terdon: I have two tables including some parameters of many galaxies. Assume table one include "mass" , "name","color" and "distance" of galaxies and table 2 include "velocity" and "name". . some galaxies are in common so have same name (column B). Now, I want to have all parameters for this galaxies in one table.
– Negar
Dec 17 at 12:46
add a comment |
2 Answers
2
active
oldest
votes
You can do this kind of thing by building a hash / associative array / lookup table e.g. using Awk:
$ awk 'NR==FNR{B[$2]=$1; next} $2 in B {print $0,B[$2]}' file2.txt file1.txt
A B C D E
a b c d h
i ii iii iV k
There's also the join
command - but that requires inputs to be sorted on the common field.
It works. Thanks, but it doesn't print all columns of file1. could you please tell me how can I print all columns?
– Negar
Dec 17 at 15:11
@Negar it should print all of the original columns fromfile1.txt
(since$0
in Awk represents the whole record) - is there something additional about your actual files that you forgot to mention?
– steeldriver
Dec 17 at 15:26
@ steeldriver It works right. I made a small mistake. Thanks
– Negar
Dec 18 at 0:55
add a comment |
To add to the answer by @steeldriver, to do it with sort
and join
:
join -j 2 -o 1.1,0,1.3,1.4,2.1 <(sort -k 2 file1.txt) <(sort -k 2 file2.txt)
-j 2
tellsjoin
which field is the key.
-o
gives the order of the fields in the output, where0
is the common key, and the others areFILENUM.FIELD
. Seeman join
for details.
-k 2
tellssort
which field is the key.
<( )
isbash
process substitution.
Output is:
a b c d h
A B C D E
i ii iii iV k
1
You may find that the column header position can be preserved by enforcing an appropriate locale e.g.LC_COLLATE=C sort ...
– steeldriver
Dec 17 at 14:16
@Christoffer Hammarström: Thanks. could you please explain about 1.1, 0 ,1.3 ,... since my file1 has 13 columns and this command don't print all of them.
– Negar
Dec 17 at 15:13
@Negar: Edited. Seeman join
for further details.
– Christoffer Hammarström
Dec 17 at 15:17
linux.die.net/man/1/join
– Christoffer Hammarström
Dec 17 at 15:18
add a comment |
Your Answer
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "89"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2faskubuntu.com%2fquestions%2f1102529%2fhow-can-i-extract-records-with-a-column-in-common%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
You can do this kind of thing by building a hash / associative array / lookup table e.g. using Awk:
$ awk 'NR==FNR{B[$2]=$1; next} $2 in B {print $0,B[$2]}' file2.txt file1.txt
A B C D E
a b c d h
i ii iii iV k
There's also the join
command - but that requires inputs to be sorted on the common field.
It works. Thanks, but it doesn't print all columns of file1. could you please tell me how can I print all columns?
– Negar
Dec 17 at 15:11
@Negar it should print all of the original columns fromfile1.txt
(since$0
in Awk represents the whole record) - is there something additional about your actual files that you forgot to mention?
– steeldriver
Dec 17 at 15:26
@ steeldriver It works right. I made a small mistake. Thanks
– Negar
Dec 18 at 0:55
add a comment |
You can do this kind of thing by building a hash / associative array / lookup table e.g. using Awk:
$ awk 'NR==FNR{B[$2]=$1; next} $2 in B {print $0,B[$2]}' file2.txt file1.txt
A B C D E
a b c d h
i ii iii iV k
There's also the join
command - but that requires inputs to be sorted on the common field.
It works. Thanks, but it doesn't print all columns of file1. could you please tell me how can I print all columns?
– Negar
Dec 17 at 15:11
@Negar it should print all of the original columns fromfile1.txt
(since$0
in Awk represents the whole record) - is there something additional about your actual files that you forgot to mention?
– steeldriver
Dec 17 at 15:26
@ steeldriver It works right. I made a small mistake. Thanks
– Negar
Dec 18 at 0:55
add a comment |
You can do this kind of thing by building a hash / associative array / lookup table e.g. using Awk:
$ awk 'NR==FNR{B[$2]=$1; next} $2 in B {print $0,B[$2]}' file2.txt file1.txt
A B C D E
a b c d h
i ii iii iV k
There's also the join
command - but that requires inputs to be sorted on the common field.
You can do this kind of thing by building a hash / associative array / lookup table e.g. using Awk:
$ awk 'NR==FNR{B[$2]=$1; next} $2 in B {print $0,B[$2]}' file2.txt file1.txt
A B C D E
a b c d h
i ii iii iV k
There's also the join
command - but that requires inputs to be sorted on the common field.
answered Dec 17 at 12:00
steeldriver
65.7k11104177
65.7k11104177
It works. Thanks, but it doesn't print all columns of file1. could you please tell me how can I print all columns?
– Negar
Dec 17 at 15:11
@Negar it should print all of the original columns fromfile1.txt
(since$0
in Awk represents the whole record) - is there something additional about your actual files that you forgot to mention?
– steeldriver
Dec 17 at 15:26
@ steeldriver It works right. I made a small mistake. Thanks
– Negar
Dec 18 at 0:55
add a comment |
It works. Thanks, but it doesn't print all columns of file1. could you please tell me how can I print all columns?
– Negar
Dec 17 at 15:11
@Negar it should print all of the original columns fromfile1.txt
(since$0
in Awk represents the whole record) - is there something additional about your actual files that you forgot to mention?
– steeldriver
Dec 17 at 15:26
@ steeldriver It works right. I made a small mistake. Thanks
– Negar
Dec 18 at 0:55
It works. Thanks, but it doesn't print all columns of file1. could you please tell me how can I print all columns?
– Negar
Dec 17 at 15:11
It works. Thanks, but it doesn't print all columns of file1. could you please tell me how can I print all columns?
– Negar
Dec 17 at 15:11
@Negar it should print all of the original columns from
file1.txt
(since $0
in Awk represents the whole record) - is there something additional about your actual files that you forgot to mention?– steeldriver
Dec 17 at 15:26
@Negar it should print all of the original columns from
file1.txt
(since $0
in Awk represents the whole record) - is there something additional about your actual files that you forgot to mention?– steeldriver
Dec 17 at 15:26
@ steeldriver It works right. I made a small mistake. Thanks
– Negar
Dec 18 at 0:55
@ steeldriver It works right. I made a small mistake. Thanks
– Negar
Dec 18 at 0:55
add a comment |
To add to the answer by @steeldriver, to do it with sort
and join
:
join -j 2 -o 1.1,0,1.3,1.4,2.1 <(sort -k 2 file1.txt) <(sort -k 2 file2.txt)
-j 2
tellsjoin
which field is the key.
-o
gives the order of the fields in the output, where0
is the common key, and the others areFILENUM.FIELD
. Seeman join
for details.
-k 2
tellssort
which field is the key.
<( )
isbash
process substitution.
Output is:
a b c d h
A B C D E
i ii iii iV k
1
You may find that the column header position can be preserved by enforcing an appropriate locale e.g.LC_COLLATE=C sort ...
– steeldriver
Dec 17 at 14:16
@Christoffer Hammarström: Thanks. could you please explain about 1.1, 0 ,1.3 ,... since my file1 has 13 columns and this command don't print all of them.
– Negar
Dec 17 at 15:13
@Negar: Edited. Seeman join
for further details.
– Christoffer Hammarström
Dec 17 at 15:17
linux.die.net/man/1/join
– Christoffer Hammarström
Dec 17 at 15:18
add a comment |
To add to the answer by @steeldriver, to do it with sort
and join
:
join -j 2 -o 1.1,0,1.3,1.4,2.1 <(sort -k 2 file1.txt) <(sort -k 2 file2.txt)
-j 2
tellsjoin
which field is the key.
-o
gives the order of the fields in the output, where0
is the common key, and the others areFILENUM.FIELD
. Seeman join
for details.
-k 2
tellssort
which field is the key.
<( )
isbash
process substitution.
Output is:
a b c d h
A B C D E
i ii iii iV k
1
You may find that the column header position can be preserved by enforcing an appropriate locale e.g.LC_COLLATE=C sort ...
– steeldriver
Dec 17 at 14:16
@Christoffer Hammarström: Thanks. could you please explain about 1.1, 0 ,1.3 ,... since my file1 has 13 columns and this command don't print all of them.
– Negar
Dec 17 at 15:13
@Negar: Edited. Seeman join
for further details.
– Christoffer Hammarström
Dec 17 at 15:17
linux.die.net/man/1/join
– Christoffer Hammarström
Dec 17 at 15:18
add a comment |
To add to the answer by @steeldriver, to do it with sort
and join
:
join -j 2 -o 1.1,0,1.3,1.4,2.1 <(sort -k 2 file1.txt) <(sort -k 2 file2.txt)
-j 2
tellsjoin
which field is the key.
-o
gives the order of the fields in the output, where0
is the common key, and the others areFILENUM.FIELD
. Seeman join
for details.
-k 2
tellssort
which field is the key.
<( )
isbash
process substitution.
Output is:
a b c d h
A B C D E
i ii iii iV k
To add to the answer by @steeldriver, to do it with sort
and join
:
join -j 2 -o 1.1,0,1.3,1.4,2.1 <(sort -k 2 file1.txt) <(sort -k 2 file2.txt)
-j 2
tellsjoin
which field is the key.
-o
gives the order of the fields in the output, where0
is the common key, and the others areFILENUM.FIELD
. Seeman join
for details.
-k 2
tellssort
which field is the key.
<( )
isbash
process substitution.
Output is:
a b c d h
A B C D E
i ii iii iV k
edited Dec 17 at 15:17
answered Dec 17 at 13:30
Christoffer Hammarström
1514
1514
1
You may find that the column header position can be preserved by enforcing an appropriate locale e.g.LC_COLLATE=C sort ...
– steeldriver
Dec 17 at 14:16
@Christoffer Hammarström: Thanks. could you please explain about 1.1, 0 ,1.3 ,... since my file1 has 13 columns and this command don't print all of them.
– Negar
Dec 17 at 15:13
@Negar: Edited. Seeman join
for further details.
– Christoffer Hammarström
Dec 17 at 15:17
linux.die.net/man/1/join
– Christoffer Hammarström
Dec 17 at 15:18
add a comment |
1
You may find that the column header position can be preserved by enforcing an appropriate locale e.g.LC_COLLATE=C sort ...
– steeldriver
Dec 17 at 14:16
@Christoffer Hammarström: Thanks. could you please explain about 1.1, 0 ,1.3 ,... since my file1 has 13 columns and this command don't print all of them.
– Negar
Dec 17 at 15:13
@Negar: Edited. Seeman join
for further details.
– Christoffer Hammarström
Dec 17 at 15:17
linux.die.net/man/1/join
– Christoffer Hammarström
Dec 17 at 15:18
1
1
You may find that the column header position can be preserved by enforcing an appropriate locale e.g.
LC_COLLATE=C sort ...
– steeldriver
Dec 17 at 14:16
You may find that the column header position can be preserved by enforcing an appropriate locale e.g.
LC_COLLATE=C sort ...
– steeldriver
Dec 17 at 14:16
@Christoffer Hammarström: Thanks. could you please explain about 1.1, 0 ,1.3 ,... since my file1 has 13 columns and this command don't print all of them.
– Negar
Dec 17 at 15:13
@Christoffer Hammarström: Thanks. could you please explain about 1.1, 0 ,1.3 ,... since my file1 has 13 columns and this command don't print all of them.
– Negar
Dec 17 at 15:13
@Negar: Edited. See
man join
for further details.– Christoffer Hammarström
Dec 17 at 15:17
@Negar: Edited. See
man join
for further details.– Christoffer Hammarström
Dec 17 at 15:17
linux.die.net/man/1/join
– Christoffer Hammarström
Dec 17 at 15:18
linux.die.net/man/1/join
– Christoffer Hammarström
Dec 17 at 15:18
add a comment |
Thanks for contributing an answer to Ask Ubuntu!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Some of your past answers have not been well-received, and you're in danger of being blocked from answering.
Please pay close attention to the following guidance:
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2faskubuntu.com%2fquestions%2f1102529%2fhow-can-i-extract-records-with-a-column-in-common%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
2
Could you explain what you are doing in a bit more detail please? Why is the last line of
file1
not in the output? And why is column B unchanged in the output while column E is added?– terdon♦
Dec 17 at 11:59
@terdon: I have two tables including some parameters of many galaxies. Assume table one include "mass" , "name","color" and "distance" of galaxies and table 2 include "velocity" and "name". . some galaxies are in common so have same name (column B). Now, I want to have all parameters for this galaxies in one table.
– Negar
Dec 17 at 12:46