Created
June 28, 2011 14:31
-
-
Save xarg/1051247 to your computer and use it in GitHub Desktop.
Douglas-Peucker
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
# pure-Python Douglas-Peucker line simplification/generalization | |
# | |
# this code was written by Schuyler Erle <[email protected]> and is | |
# made available in the public domain. | |
# | |
# the code was ported from a freely-licensed example at | |
# http://www.3dsoftware.com/Cartography/Programming/PolyLineReduction/ | |
# | |
# the original page is no longer available, but is mirrored at | |
# http://www.mappinghacks.com/code/PolyLineReduction/ | |
""" | |
>>> line = [(0,0),(1,0),(2,0),(2,1),(2,2),(1,2),(0,2),(0,1),(0,0)] | |
>>> simplify_points(line, 1.0) | |
[(0, 0), (2, 0), (2, 2), (0, 2), (0, 0)] | |
>>> line = [(0,0),(0.5,0.5),(1,0),(1.25,-0.25),(1.5,.5)] | |
>>> simplify_points(line, 0.25) | |
[(0, 0), (0.5, 0.5), (1.25, -0.25), (1.5, 0.5)] | |
""" | |
import math | |
def simplify_points (pts, tolerance): | |
anchor = 0 | |
floater = len(pts) - 1 | |
stack = [] | |
keep = set() | |
stack.append((anchor, floater)) | |
while stack: | |
anchor, floater = stack.pop() | |
# initialize line segment | |
if pts[floater] != pts[anchor]: | |
anchorX = float(pts[floater][0] - pts[anchor][0]) | |
anchorY = float(pts[floater][1] - pts[anchor][1]) | |
seg_len = math.sqrt(anchorX ** 2 + anchorY ** 2) | |
# get the unit vector | |
anchorX /= seg_len | |
anchorY /= seg_len | |
else: | |
anchorX = anchorY = seg_len = 0.0 | |
# inner loop: | |
max_dist = 0.0 | |
farthest = anchor + 1 | |
for i in range(anchor + 1, floater): | |
dist_to_seg = 0.0 | |
# compare to anchor | |
vecX = float(pts[i][0] - pts[anchor][0]) | |
vecY = float(pts[i][1] - pts[anchor][1]) | |
seg_len = math.sqrt( vecX ** 2 + vecY ** 2 ) | |
# dot product: | |
proj = vecX * anchorX + vecY * anchorY | |
if proj < 0.0: | |
dist_to_seg = seg_len | |
else: | |
# compare to floater | |
vecX = float(pts[i][0] - pts[floater][0]) | |
vecY = float(pts[i][1] - pts[floater][1]) | |
seg_len = math.sqrt( vecX ** 2 + vecY ** 2 ) | |
# dot product: | |
proj = vecX * (-anchorX) + vecY * (-anchorY) | |
if proj < 0.0: | |
dist_to_seg = seg_len | |
else: # calculate perpendicular distance to line (pythagorean theorem): | |
dist_to_seg = math.sqrt(abs(seg_len ** 2 - proj ** 2)) | |
if max_dist < dist_to_seg: | |
max_dist = dist_to_seg | |
farthest = i | |
if max_dist <= tolerance: # use line segment | |
keep.add(anchor) | |
keep.add(floater) | |
else: | |
stack.append((anchor, farthest)) | |
stack.append((farthest, floater)) | |
keep = list(keep) | |
keep.sort() | |
return [pts[i] for i in keep] | |
if __name__ == "__main__": | |
import doctest | |
doctest.testmod() |
@stevastator: good observation, the if statement (if max_dist < dist_to_seg:) on line 71 should be outside the else block starting at line 60. line 71 should belong to the top level block of the for loop (at line 50), same level as lines 51-60. The porting should consider the curly braces in cpp instead of indentation in cpp. there is a geometrical difference if not fixed. thanks stevastator !
Couldn't you calculate the dist_len only when needed? Ie calculate line 55 at line 59 and not before, same for 64/68. You would save some expensive sqrt calculations.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Hi,
this is great code that I have actually used a lot myself. Today - by accident - I inspected it a little closer once more and found a confusing issue:
The max_dist test in line 71 is performed only within the branch starting at line 60. In other words, if the condition in line 58 is true (next point lies "behind" current segment), the value of dist_to_seg established in line 59 is never used and the current point will never be considered as a point to keep in this iteration, regardless of its actual shortest distance to the segment.
Although the code posted here is exactly like all other occurrences of that code I found on the web, I still have the strong feeling that this is actually a bug: The check in line 71 should be outside the if-else clause starting in line 58 (i.e., should be one indent level less) and should be performed each iteration.
I believe so for the following reasons:
a) considering what the algorithm is intended for, the current scheme doesn't seem to make sense geometrically (also after reading the cited site)
b) looking closer at the original C++ code on the cited site, it can be observed that the indentation used at that point is quite confusing and might likely have led to such error during porting to python. One has to carefully inspect the curly brackets of the original code in order to see that there the corresponding check is always performed.
Or am I missing something here? Can you comment on that? I am grateful for anyone proving me wrong.
Thanks and best regards