Name       Bash Beautifier
Description

BeautifyBash has three modes of operation:

If given a list of file names, it backs up each file that needs changes (e.g. file1.sh~) and overwrites the original with a beautified replacement:
beautify_bash.py file1.sh file2.sh file3.sh
If given '-' as a command-line argument, it uses stdin as its source and stdout as its sink:
beautify_bash.py - < infile.sh > outfile.sh
If imported as a module, it does not execute its main() function:
#!/usr/bin/env python
# -*- coding: utf-8 -*-

from beautify_bash import BeautifyBash

[ ... ]

result,error = BeautifyBash().beautify_string(source)
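
For the first mode above (named files), the backup-then-overwrite step can be sketched as follows. This is a minimal illustration rather than BeautifyBash's own code: `rewrite_with_backup` and its `transform` argument are hypothetical names, with `transform` standing in for the beautifier. Like the program listing, the sketch only writes the `~` backup when the content actually changes.

```python
import os
import tempfile


def rewrite_with_backup(path, transform):
    """Rewrite a text file in place, keeping the original as path + '~'.

    transform is a hypothetical callable (str -> str) standing in for the
    beautifier. Returns True if the file was rewritten.
    """
    with open(path) as f:
        original = f.read()
    result = transform(original)
    if result == original:
        return False  # nothing changed, so no backup is written
    with open(path + '~', 'w') as f:
        f.write(original)  # save the backup before touching the original
    with open(path, 'w') as f:
        f.write(result)
    return True


if __name__ == '__main__':
    # Try it on a throwaway file; str.lstrip is only a stand-in transform.
    demo = os.path.join(tempfile.mkdtemp(), 'demo.sh')
    with open(demo, 'w') as f:
        f.write('   echo hello\n')
    print(rewrite_with_backup(demo, str.lstrip))
    print(sorted(os.listdir(os.path.dirname(demo))))
```

Writing the backup before the replacement means an interrupted run leaves at least one intact copy of the original text.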

BeautifyBash handles Bash here-docs conservatively, although there are probably some edge cases it doesn't handle.
The basic idea is that the script's author knew what format was wanted in the here-doc,
and a beautifier shouldn't try to second-guess that. So BeautifyBash does all it can to pass the here-doc content along unchanged:

if true
then

  echo "Before here-doc"

  # Insert 2 lines in file, then save.
  #--------Begin here document-----------#
vi $TARGETFILE <<x23LimitStringx23
i
This is line 1 of the example file.
This is line 2 of the example file.
^[
ZZ
x23LimitStringx23
  #----------End here document-----------#

  echo "After here-doc"

fi
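
The pass-through behavior shown in this example can be sketched in a few lines. The code below is a simplified illustration, not the routine from the program listing: `HERE_DOC_START` and `split_here_docs` are hypothetical names, the pattern is a reduced form of the one in the listing, and the terminator test (an exact match after stripping whitespace) is stricter than the listing's search.

```python
import re

# '<<' or '<<-', optional whitespace, then an optionally quoted limit string.
HERE_DOC_START = re.compile(r'<<-?\s*[\'"]?(\w+)[\'"]?')


def split_here_docs(script):
    """Yield (line, in_here_doc) pairs for each line of a script.

    Lines flagged True (the here-doc opener, its body and its terminator)
    are the ones a beautifier would pass along unchanged.
    """
    limit = None
    for line in script.split('\n'):
        if limit is None:
            match = HERE_DOC_START.search(line)
            if match:
                limit = match.group(1)  # remember the limit string
                yield line, True
            else:
                yield line, False
        else:
            yield line, True
            if line.strip() == limit:  # terminator reached
                limit = None


example = 'echo start\ncat <<EOF\n   keep   this\nEOF\necho end'
for line, frozen in split_here_docs(example):
    print('keep  ' if frozen else 'indent', repr(line))
```

The sketch makes no attempt to handle quoting, comments or here-strings (`<<<`), which is where the edge cases mentioned above tend to arise.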
As written, BeautifyBash can beautify large numbers of Bash scripts when called from ... well, among other things, a Bash script:

#!/bin/sh

# Caveat: this loop splits on whitespace, so it assumes file names without spaces.
for path in `find /path -name '*.sh'`
do
  beautify_bash.py "$path"
done
As well as the more obvious example:

$ beautify_bash.py *.sh
CAUTION: Because BeautifyBash overwrites the files submitted to it,
this could have disastrous consequences if the files include some of the increasingly
common Bash scripts that have appended binary content (a case where BeautifyBash's behavior is undefined).
So please back up your files, and don't treat BeautifyBash as a harmless utility. It is harmless only most of the time.
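
One way to reduce the risk described above is to screen files before handing them to the beautifier. The sketch below is not part of BeautifyBash: `looks_binary` and `safe_to_beautify` are hypothetical helpers built on a common heuristic, namely treating any file that contains a NUL byte as binary. A script whose appended payload happens to contain no NUL bytes would slip through, so this complements backups rather than replacing them.

```python
import os
import tempfile


def looks_binary(path, chunk_size=65536):
    """Return True if the file contains a NUL byte (a heuristic, not proof)."""
    with open(path, 'rb') as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                return False  # reached end of file without seeing a NUL byte
            if b'\0' in chunk:
                return True


def safe_to_beautify(paths):
    """Keep only the paths that pass the NUL-byte check."""
    return [p for p in paths if not looks_binary(p)]


if __name__ == '__main__':
    workdir = tempfile.mkdtemp()
    plain = os.path.join(workdir, 'plain.sh')
    packed = os.path.join(workdir, 'installer.sh')
    with open(plain, 'w') as f:
        f.write('#!/bin/sh\necho hello\n')
    with open(packed, 'wb') as f:
        f.write(b'#!/bin/sh\nexit 0\n\x00\x01\x02payload')
    print(safe_to_beautify([plain, packed]))  # only plain.sh should remain
```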

OS       Linux/UNIX
License       GNU General Public License

 

Program Listing

#!/usr/bin/env python
# -*- coding: utf-8 -*-

#**************************************************************************
# Copyright (C) 2011, Paul Lutus *
# *
# This program is free software; you can redistribute it and/or modify *
# it under the terms of the GNU General Public License as published by *
# the Free Software Foundation; either version 2 of the License, or *
# (at your option) any later version. *
# *
# This program is distributed in the hope that it will be useful, *
# but WITHOUT ANY WARRANTY; without even the implied warranty of *
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the *
# GNU General Public License for more details. *
# *
# You should have received a copy of the GNU General Public License *
# along with this program; if not, write to the *
# Free Software Foundation, Inc., *
# 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. *
#**************************************************************************

import re, sys

PVERSION = '1.0'

class BeautifyBash:

  def __init__(self):
    self.tab_str = ' '
    self.tab_size = 2

  def read_file(self,fp):
    with open(fp) as f:
      return f.read()

  def write_file(self,fp,data):
    with open(fp,'w') as f:
      f.write(data)

  def beautify_string(self,data,path = ''):
    tab = 0
    case_stack = []
    in_here_doc = False
    defer_ext_quote = False
    in_ext_quote = False
    ext_quote_string = ''
    here_string = ''
    output = []
    line = 1
    for record in re.split('\n',data):
      record = record.rstrip()
      stripped_record = record.strip()

      # collapse multiple quotes between ' ... '
      test_record = re.sub(r'\'.*?\'','',stripped_record)
      # collapse multiple quotes between " ... "
      test_record = re.sub(r'".*?"','',test_record)
      # collapse multiple quotes between ` ... `
      test_record = re.sub(r'`.*?`','',test_record)
      # collapse multiple quotes between \` ... ' (weird case)
      test_record = re.sub(r'\\`.*?\'','',test_record)
      # strip out any escaped single characters
      test_record = re.sub(r'\\.','',test_record)
      # remove '#' comments
      test_record = re.sub(r'(\A|\s)(#.*)','',test_record,1)
      if(not in_here_doc):
        if(re.search('<<-?',test_record)):
          here_string = re.sub(r'.*<<-?\s*[\'|"]?([_|\w]+)[\'|"]?.*','\\1',stripped_record,1)
          in_here_doc = (len(here_string) > 0)
      if(in_here_doc): # pass on with no changes
        output.append(record)
        # now test for here-doc termination string
        if(re.search(here_string,test_record) and not re.search('<<',test_record)):
          in_here_doc = False
      else: # not in here doc
        if(in_ext_quote):
          if(re.search(ext_quote_string,test_record)):
            # provide line after quotes
            test_record = re.sub('.*%s(.*)' % ext_quote_string,'\\1',test_record,1)
            in_ext_quote = False
        else: # not in ext quote
          if(re.search(r'(\A|\s)(\'|")',test_record)):
            # apply only after this line has been processed
            defer_ext_quote = True
            ext_quote_string = re.sub('.*([\'"]).*','\\1',test_record,1)
            # provide line before quote
            test_record = re.sub('(.*)%s.*' % ext_quote_string,'\\1',test_record,1)
        if(in_ext_quote):
          # pass on unchanged
          output.append(record)
        else: # not in ext quote
          inc = len(re.findall(r'(\s|\A|;)(case|then|do)(;|\Z|\s)',test_record))
          inc += len(re.findall(r'(\{|\(|\[)',test_record))
          outc = len(re.findall(r'(\s|\A|;)(esac|fi|done|elif)(;|\)|\||\Z|\s)',test_record))
          outc += len(re.findall(r'(\}|\)|\])',test_record))
          if(re.search(r'\besac\b',test_record)):
            if(len(case_stack) == 0):
              sys.stderr.write(
                'File %s: error: "esac" before "case" in line %d.\n' % (path,line)
              )
            else:
              outc += case_stack.pop()
          # special handling for bad syntax within case ... esac
          if(len(case_stack) > 0):
            if(re.search(r'\A[^(]*\)',test_record)):
              # avoid overcount
              outc -= 2
              case_stack[-1] += 1
            if(re.search(';;',test_record)):
              outc += 1
              case_stack[-1] -= 1
          # an ad-hoc solution for the "else" keyword
          else_case = (0,-1)[re.search('^(else)',test_record) != None]
          net = inc - outc
          tab += min(net,0)
          extab = tab + else_case
          extab = max(0,extab)
          output.append((self.tab_str * self.tab_size * extab) + stripped_record)
          tab += max(net,0)
        if(defer_ext_quote):
          in_ext_quote = True
          defer_ext_quote = False
        if(re.search(r'\bcase\b',test_record)):
          case_stack.append(0)
      line += 1
    error = (tab != 0)
    if(error):
      sys.stderr.write('File %s: error: indent/outdent mismatch: %d.\n' % (path,tab))
    return '\n'.join(output), error

  def beautify_file(self,path):
    error = False
    if(path == '-'):
      data = sys.stdin.read()
      result,error = self.beautify_string(data,'(stdin)')
      sys.stdout.write(result)
    else: # named file
      data = self.read_file(path)
      result,error = self.beautify_string(data,path)
      if(data != result):
        # make a backup copy
        self.write_file(path + '~',data)
        self.write_file(path,result)
    return error

  def main(self):
    error = False
    sys.argv.pop(0)
    if(len(sys.argv) < 1):
      sys.stderr.write('usage: shell script filenames or \"-\" for stdin.\n')
    else:
      for path in sys.argv:
        error |= self.beautify_file(path)
    sys.exit((0,1)[error])

# if not called as a module
if(__name__ == '__main__'):
  BeautifyBash().main()
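
The heart of beautify_string in the listing is a counting rule: closing keywords reduce the indent on the line where they appear, opening keywords increase it from the next line on, and "else" is printed one level out. The sketch below isolates that rule for the simplest keywords only (then, do, fi, done, else). It is an illustration under those assumptions, not a replacement for the listing: `reindent`, `OPENERS` and `CLOSERS` are hypothetical names, and quotes, comments, here-docs, brackets, elif and case ... esac are all deliberately ignored.

```python
import re

# A keyword counts only when it stands alone: preceded by start, space or ';'
# and followed by ';', space or end of line.
OPENERS = re.compile(r'(?:\A|\s|;)(?:then|do)(?=;|\s|\Z)')
CLOSERS = re.compile(r'(?:\A|\s|;)(?:fi|done)(?=;|\s|\Z)')


def reindent(script, indent='  '):
    """Return (reindented_text, error); error is True if blocks are unbalanced."""
    depth = 0
    output = []
    for record in script.split('\n'):
        stripped = record.strip()
        net = len(OPENERS.findall(stripped)) - len(CLOSERS.findall(stripped))
        depth += min(net, 0)  # closers take effect on this line
        shift = -1 if re.match(r'else\b', stripped) else 0
        level = max(0, depth + shift)
        output.append(indent * level + stripped if stripped else '')
        depth += max(net, 0)  # openers take effect from the next line
    return '\n'.join(output), depth != 0


source = 'for f in a b; do\nif true; then\necho "$f"\nelse\necho no\nfi\ndone'
text, error = reindent(source)
print(text)
print('unbalanced' if error else 'balanced')
```

Splitting the net change into a negative part applied before printing and a positive part applied afterwards is what lets a line such as "fi" move out immediately while "then" only affects the lines that follow it.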