Converting a list into a tab separated file && grouped by values

Hello nixCraft,

This is my first post here and I’m actually a bit excited to have joined. I’ve seen this website in almost all searches i’ve done over the many years I’ve used and dabbled in Linux. I’m hoping the community is inviting to novice shell scripters. So let me share.

I have a text file with a list of values. Here is a snip:

|01BFRUITS|
^banana
^apple
^orange
^pear
|01AELECTRONICS|
^television
^radio
^dishwasher
^computer
|01AANIMAL|
^bear
^cat
^dog
^elephant
|01ASHAPE|
^circle
^square
^diamond
^star

Values starting with a PIPE can be considered headers and values with a CARET are values to the header it is under.

My goal is to create a ‘tab separated value’ file with the headers on the left. So after much googling, man files, and some other forum help… I’ve managed to come up with two commands… both output differently in my efforts.

First command:

cat test.txt | awk -v OFS='\t' '/^\|/{ c1=$0; gsub(/\|/,"",c1) } /^\^/{ c2=$0; sub(/^\^/,"",c2); print c1,OFS,c2 }' | sed -z s/\\r\\t\\t//g

Which produced:

01BFRUITS       banana
01BFRUITS       apple
01BFRUITS       orange
01BFRUITS       pear
01AELECTRONICS  television
01AELECTRONICS  radio
01AELECTRONICS  dishwasher
01AELECTRONICS  computer
01AANIMAL       bear
01AANIMAL       cat
01AANIMAL       dog
01AANIMAL       elephant
01ASHAPE        circle
01ASHAPE        square
01ASHAPE        diamond
01ASHAPE        star

The second command is:

cat test.txt | sed -z 's/\r\n\^/\t/g' | tr -d '|'

Which produced:

01BFRUITS       banana  apple   orange  pear
01AELECTRONICS  television      radio   dishwasher      computer
01AANIMAL       bear    cat     dog     elephant
01ASHAPE        circle  square  diamond star

Now my list has unique values in my test run. My new list has duplicates like so:

|01BFRUITS|
^banana
^apple
^orange
^pear
^banana
^apple
^orange
^pear
|01AELECTRONICS|
^television
^radio
^dishwasher
^computer
^television
^radio
^dishwasher
^computer
^television
^radio
^dishwasher
^computer
|01AANIMAL|
^bear
^cat
^dog
^elephant
^bear
^cat
^dog
^elephant
^bear
^cat
^dog
^elephant
^bear
^cat
^dog
^elephant
|01ASHAPE|
^circle
^square
^diamond
^star
^circle
^square
^diamond
^star
^circle
^square
^diamond
^star
^circle
^square
^diamond
^star
^circle
^square
^diamond
^star

And the desired output I am looking for is this:

01BFRUITS	banana	banana			
01BFRUITS	apple	apple			
01BFRUITS	orange	orange			
01BFRUITS	pear	pear			
01AELECTRONICS	television	television	television		
01AELECTRONICS	radio	radio	radio		
01AELECTRONICS	dishwasher	dishwasher	dishwasher		
01AELECTRONICS	computer	computer	computer		
01AANIMAL	bear	bear	bear	bear	
01AANIMAL	cat	cat	cat	cat	
01AANIMAL	dog	dog	dog	dog	
01AANIMAL	elephant	elephant	elephant	elephant	
01ASHAPE	circle	circle	circle	circle	circle
01ASHAPE	square	square	square	square	square
01ASHAPE	diamond	diamond	diamond	diamond	diamond
01ASHAPE	star	star	star	star	star

My intention is to group all values of the same value together and maintain the left header. I have no idea how to approach it with awk, sed, or tr. I did manage a way in excel, but the processing power it takes on my old computer is annoying. I think the cli will speed things up greatly.

My hardware is a RPi 4 running raspbian.

Any help or solutions will be greatly appreciated

Was able to resolve this using ChatGPT:

cat test.txt | sed -z 's/\r\n\^/,/g' | tr -d '|' | awk -F, "{ key = \$1; for (i=2; i<=NF; i++) { values[key][\$i] = values[key][\$i]\",\"\$i } } END { for (key in values) { for (value in values[key]) { printf \"%s%s\n\", key, values[key][value] } } }"

I did wind up changing the file to a csv and not tabs.