Object Removal by Exemplar-Based Inpainting (Translation)


Abstract:

  A new algorithm is proposed for removing large objects from digital images. The challenge is to fill in the hole that is left behind in a visually plausible way.

  In the past, this problem has been addressed by two classes of algorithms: (i) “texture synthesis” algorithms for generating large image regions from sample textures, and (ii) “inpainting” techniques for filling in small image gaps. The former work well for “textures” – repeating two-dimensional patterns with some stochasticity; the latter focus on linear “structures” which can be thought of as one-dimensional patterns, such as lines and object contours.

  This paper presents a novel and efficient algorithm that combines the advantages of these two approaches. We first note that exemplar-based texture synthesis contains the essential process required to replicate both texture and structure; the success of structure propagation, however, is highly dependent on the order in which the filling proceeds. We propose a best-first algorithm in which the confidence in the synthesized pixel values is propagated in a manner similar to the propagation of information in inpainting. The actual colour values are computed using exemplar-based synthesis. Computational efficiency is achieved by a block-based sampling process.

  A number of examples on real and synthetic images demonstrate the effectiveness of our algorithm in removing large occluding objects as well as thin scratches. Robustness with respect to the shape of the manually selected target region is also demonstrated. Our results compare favorably to those obtained by existing techniques.

1. Introduction

  This paper presents a novel algorithm for removing objects from digital photographs and replacing them with visually plausible backgrounds. Figure 1 shows an example of this task, where the foreground person (manually selected as the target region) is replaced by textures sampled from the remainder of the image. The algorithm effectively hallucinates new colour values for the target region in a way that looks “reasonable” to the human eye.

  In previous work, several researchers have considered texture synthesis as a way to fill large image regions with “pure” textures – repetitive two-dimensional textural patterns with moderate stochasticity. This is based on a large body of texture-synthesis research, which seeks to replicate texture ad infinitum, given a small source sample of pure texture [1, 8, 9, 10, 11, 12, 14, 15, 16, 19, 22]. Of particular interest are exemplar-based techniques which cheaply and effectively generate new texture by sampling and copying colour values from the source [1, 9, 10, 11, 15].

  As effective as these techniques are in replicating consistent texture, they have difficulty filling holes in photographs of real-world scenes, which often consist of linear structures and composite textures – multiple textures interacting spatially [23]. The main problem is that boundaries between image regions are a complex product of mutual influences between different textures. In contrast to the two-dimensional nature of pure textures, these boundaries form what might be considered more one-dimensional, or linear, image structures.

  A number of algorithms specifically address this issue for the task of image restoration, where speckles, scratches, and overlaid text are removed [2, 3, 4, 7, 20]. These image inpainting techniques fill holes in images by propagating linear structures (called isophotes in the inpainting literature) into the target region via diffusion. They are inspired by the partial differential equations of physical heat flow, and work convincingly as restoration algorithms. Their drawback is that the diffusion process introduces some blur, which is noticeable when the algorithm is applied to fill larger regions.

  The algorithm presented here combines the strengths of both approaches. As with inpainting, we pay special attention to linear structures. But, linear structures abutting the target region only influence the fill order of what is at core an exemplar-based texture synthesis algorithm. The result is an algorithm that has the efficiency and qualitative performance of exemplar-based texture synthesis, but which also respects the image constraints imposed by surrounding linear structures.

  Our algorithm builds on very recent research along similar lines. The work in [5] decomposes the original image into two components; one of which is processed by inpainting and the other by texture synthesis. The output image is the sum of the two processed components. This approach still remains limited to the removal of small image gaps, however, as the diffusion process continues to blur the filled region (cf., [5], fig.5 top right). The automatic switching between “pure texture-” and “pure structure-mode” described in [21] is also avoided.

  One of the first attempts to use exemplar-based synthesis specifically for object removal was by Harrison [13]. There, the order in which a pixel in the target region is filled was dictated by the level of “texturedness” of the pixel’s neighborhood. Although the intuition is sound, strong linear structures were often overruled by nearby noise, minimizing the value of the extra computation. A related technique drove the fill order by the local shape of the target region, but did not seek to explicitly propagate linear structure [6]. 

Finally, Zalesny et al. [23] describe an interesting algorithm for the parallel synthesis of composite textures. They devise a special-purpose solution for the interface between two textures. In this paper we show that, in fact, only one mechanism is sufficient for the synthesis of both pure and composite textures.

Section 2 presents the key observation on which our algorithm depends. Section 3 describes the details of the algorithm.
Results on both synthetic and real imagery are presented in section 4.

2. Exemplar-based synthesis suffices

The core of our algorithm is an isophote-driven image-sampling process. It is well-understood that exemplar-based approaches perform well for two-dimensional textures [1, 9, 15]. But, we note in addition that exemplar-based texture synthesis is sufficient for propagating extended linear image structures, as well. A separate synthesis mechanism is not required for handling isophotes.

  Figure 2 illustrates this point. For ease of comparison, we adopt notation similar to that used in the inpainting literature. The region to be filled, i.e., the target region is indicated by Ω, and its contour is denoted δΩ. The contour evolves inward as the algorithm progresses, and so we also refer to it as the “fill front”. The source region, Φ, which remains fixed throughout the algorithm, provides samples used in the filling process.

  We now focus on a single iteration of the algorithm to show how structure and texture are adequately handled by exemplar-based synthesis. Suppose that the square template Ψp ∈ Ω centred at the point p (fig. 2b), is to be filled. The best-match sample from the source region comes from the patch Ψq ∈ Φ, which is most similar to those parts that are already filled in Ψp. In the example in fig. 2b, we see that if Ψp lies on the continuation of an image edge, the most likely best matches will lie along the same (or a similarly coloured) edge (e.g., Ψq′ and Ψq′′ in fig. 2c).

     All that is required to propagate the isophote inwards is a simple transfer of the pattern from the best-match source patch (fig. 2d). Notice that isophote orientation is automatically preserved. In the figure, despite the fact that the original edge is not orthogonal to the target contour δΩ, the propagated structure has maintained the same orientation as in the source region.

3. Region-filling algorithm

We now proceed with the details of our algorithm.

First, a user selects a target region, Ω, to be removed and filled. The source region, Φ, may be defined as the entire image minus the target region (Φ = I−Ω), as a dilated band around the target region, or it may be manually specified by the user.

Next, as with all exemplar-based texture synthesis [10], the size of the template window Ψ must be specified. We provide a default window size of 9×9 pixels, but in practice require the user to set it to be slightly larger than the largest distinguishable texture element, or “texel”, in the source region.

Once these parameters are determined, the remainder of the region-filling process is completely automatic.

In our algorithm, each pixel maintains a colour value (or “empty”, if the pixel is unfilled) and a confidence value, which reflects our confidence in the pixel value, and which is frozen once a pixel has been filled. During the course of the algorithm, patches along the fill front are also given a temporary priority value, which determines the order in which they are filled. Then, our algorithm iterates the following three steps until all pixels have been filled:

3.1. Computing patch priorities.

Filling order is crucial to non-parametric texture synthesis [1, 6, 10, 13]. Thus far, the default favourite has been the “onion peel” method, where the target region is synthesized from the outside inward, in concentric layers. To our knowledge, however, designing a fill order which explicitly encourages propagation of linear structure (together with texture) has never been explored. Our algorithm performs this task through a best-first filling algorithm that depends entirely on the priority values that are assigned to each patch on the fill front. The priority computation is biased toward those patches which are on the continuation of strong edges and which are surrounded by high-confidence pixels. Given a patch Ψp centred at the point p for some p ∈ δΩ (see fig. 3), its priority P(p) is defined as the product of two terms:

P(p) = C(p) D(p).    (1)

We call C(p) the confidence term and D(p) the data term, and they are defined as follows:

C(p) = ( Σ_{q ∈ Ψp ∩ (I−Ω)} C(q) ) / |Ψp|,    D(p) = |∇Ip⊥ · np| / α

where |Ψp| is the area of Ψp, α is a normalization factor (e.g., α = 255 for a typical grey-level image), and np is a unit vector orthogonal to the front δΩ at the point p. The priority is computed for every border patch, with distinct patches for each pixel on the boundary of the target region.

During initialization, the function C(p) is set to C(p) = 0 ∀p ∈ Ω, and C(p) = 1 ∀p ∈ I −Ω. The confidence term C(p) may be thought of as a measure of the amount of reliable information surrounding the pixel p. 
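
To make the two terms concrete, here is a minimal numpy sketch (hypothetical names, not the authors' code). It assumes the confidence values live in a 2-D array `C` in which unfilled pixels hold 0 (their initial value), so summing the whole window reproduces the sum over Ψp ∩ (I−Ω), and that the data term has already been evaluated into an array `D`:

```python
import numpy as np

def confidence_term(C, p, half=4):
    """C(p): summed confidence over the (2*half+1)^2 patch centred at p,
    divided by the patch area. Unfilled pixels hold 0, so summing the
    whole window equals summing over the already-filled pixels only."""
    r, c = p
    patch = C[max(r - half, 0):r + half + 1, max(c - half, 0):c + half + 1]
    return patch.sum() / patch.size

def priority(C, D, p, half=4):
    """P(p) = C(p) * D(p), with D assumed to hold the precomputed
    data term at each fill-front pixel."""
    return confidence_term(C, p, half) * D[p]
```

Note one small deviation: at image borders this sketch divides by the clipped in-image window area, whereas |Ψp| in the paper's formula is the full patch area.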

The intention is to fill first those patches which have more of their pixels already filled, with additional preference given to pixels that were filled early on (or that were never part of the target region).

This automatically incorporates preference towards certain shapes along the fill front. For example, patches that include corners and thin tendrils of the target region will tend to be filled first, as they are surrounded by more pixels from the original image. These patches provide more reliable information against which to match. Conversely, patches at the tip of “peninsulas” of filled pixels jutting into the target region will tend to be set aside until more of the surrounding pixels are filled in. 

At a coarse level, the term C(p) of (1) approximately enforces the desirable concentric fill order. As filling proceeds, pixels in the outer layers of the target region will tend to be characterized by greater confidence values, and therefore be filled earlier; pixels in the centre of the target region will have lesser confidence values.

The data term D(p) is a function of the strength of isophotes hitting the front δΩ at each iteration. This term boosts the priority of a patch that an isophote “flows” into. This factor is of fundamental importance in our algorithm because it encourages linear structures to be synthesized first, and, therefore propagated securely into the target region. Broken lines tend to connect, thus realizing the “Connectivity Principle” of vision psychology [7, 17] (cf., fig. 4, fig. 7d, fig. 8b and fig. 13d).
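
As a rough sketch of how this term can be evaluated for one front pixel (hypothetical names, not the authors' code; it assumes a grey-level image and a precomputed unit normal to the front), note that the isophote at p is simply the image gradient rotated by 90°, so D(p) is large exactly when a strong edge “flows” into the front:

```python
import numpy as np

def data_term(gray, normal, p, alpha=255.0):
    """D(p) = |isophote(p) . n_p| / alpha for a single front pixel.

    gray: 2-D intensity image; normal: unit normal n_p to the fill
    front at p, in (row, col) coordinates; alpha: normalization
    factor (255 for a typical grey-level image)."""
    gy, gx = np.gradient(gray)            # derivatives along rows, cols
    isophote = np.array([-gx[p], gy[p]])  # gradient rotated by 90 degrees
    return abs(float(isophote @ normal)) / alpha
```

An isophote running parallel to the front (perpendicular to n_p) yields D(p) = 0, so such patches gain no data-term boost.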

There is a delicate balance between the confidence and data terms. The data term tends to push isophotes rapidly inward, while the confidence term tends to suppress precisely this sort of incursion into the target region. As presented in the results section, this balance is handled gracefully via the mechanism of a single priority computation for all patches on the fill front.

Since the fill order of the target region is dictated solely by the priority function P(p), we avoid having to predefine an arbitrary fill order as done in existing patch-based approaches [9, 19]. Our fill order is a function of image properties, resulting in an organic synthesis process that eliminates the risk of “broken-structure” artefacts (fig. 7c) and also reduces blocky artefacts without an expensive patch-cutting step [9] or a blur-inducing blending step [19].

3.2. Propagating texture and structure information.

Once all priorities on the fill front have been computed, the patch Ψp^ with the highest priority is found. We then fill it with data extracted from the source region Φ.

In traditional inpainting techniques, pixel-value information is propagated via diffusion. As noted previously, diffusion necessarily leads to image smoothing, which results in blurry fill-in, especially of large regions (see fig. 10f).

On the contrary, we propagate image texture by direct sampling of the source region. Similar to [10], we search in the source region for that patch which is most similar to Ψp^. Formally,

Ψq^ = arg min_{Ψq ∈ Φ} d(Ψp^, Ψq)

where the distance d(Ψa,Ψb) between two generic patches Ψa and Ψb is simply defined as the sum of squared differences (SSD) of the already filled pixels in the two patches. We use the CIE Lab colour space because of its property of perceptual uniformity [18].
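
A brute-force sketch of this matching step (hypothetical names, not the authors' code; a practical implementation would first convert the image to CIE Lab and accelerate the candidate scan rather than testing every patch):

```python
import numpy as np

def patch_ssd(target, source, filled):
    """d(Psi_a, Psi_b): sum of squared differences computed only over
    the already-filled pixels of the target patch.

    target, source: (h, w, 3) patches (ideally CIE Lab values);
    filled: (h, w) boolean mask of the known pixels."""
    diff = (target.astype(np.float64) - source.astype(np.float64)) ** 2
    return diff[filled].sum()

def best_match(image, filled, patch_pos, half=4):
    """Scan every patch lying entirely in the source region and return
    the centre of the one minimising the SSD against the partially
    filled target patch centred at patch_pos."""
    r, c = patch_pos
    tgt = image[r - half:r + half + 1, c - half:c + half + 1]
    msk = filled[r - half:r + half + 1, c - half:c + half + 1]
    H, W = filled.shape
    best, best_d = None, np.inf
    for i in range(half, H - half):
        for j in range(half, W - half):
            if not filled[i - half:i + half + 1, j - half:j + half + 1].all():
                continue  # candidate must lie fully inside the source region
            cand = image[i - half:i + half + 1, j - half:j + half + 1]
            d = patch_ssd(tgt, cand, msk)
            if d < best_d:
                best, best_d = (i, j), d
    return best
```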

Having found the source exemplar Ψq^, the value of each pixel-to-be-filled, p | p ∈ Ψp^ ∩ Ω, is copied from its corresponding position inside Ψq^.

This suffices to achieve the propagation of both structure and texture information from the source Φ to the target region Ω, one patch at a time (cf., fig. 2d). In fact, we note that any further manipulation of the pixel values (e.g., adding noise, smoothing and so forth) that does not explicitly depend upon statistics of the source region, is far more likely to degrade visual similarity between the filled region and the source region, than to improve it.

3.3. Updating confidence values.

After the patch Ψp^ has been filled with new pixel values, the confidence C(p) is updated in the area delimited by Ψp^ as follows:

C(p) = C(p^)    ∀ p ∈ Ψp^ ∩ Ω

This simple update rule allows us to measure the relative confidence of patches on the fill front, without imagespecific parameters. As filling proceeds, confidence values decay, indicating that we are less sure of the colour values of pixels near the centre of the target region.
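
The update is a one-line rule; here is a numpy sketch under the same conventions as above (hypothetical names; unfilled pixels hold confidence 0, so the window mean reproduces C(p^)):

```python
import numpy as np

def update_confidence(C, filled, p_hat, half=4):
    """Section 3.3 update: every pixel just filled inside the patch
    centred at p_hat inherits C(p_hat); those values are then frozen
    by marking the pixels as filled."""
    r, c = p_hat
    win = (slice(max(r - half, 0), r + half + 1),
           slice(max(c - half, 0), c + half + 1))
    patch = C[win]                      # view into C
    c_hat = patch.sum() / patch.size    # C(p_hat) before the update
    patch[~filled[win]] = c_hat         # newly filled pixels, now frozen
    filled[win] = True
    return c_hat
```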

A pseudo-code description of the algorithmic steps is shown in table 1. The superscript t indicates the current iteration.
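
Table 1 itself is not reproduced here; as an illustration only, the three steps can be strung together into a toy single-channel implementation. This sketch deliberately simplifies the paper's algorithm: the data term is treated as constant (so the fill order degenerates to the confidence-driven “onion peel”), the best-match search is exhaustive, and the target region is assumed to lie at least `half` pixels from the image border:

```python
import numpy as np

def inpaint(image, target, half=1):
    """Toy version of the loop in table 1 on a 2-D grey image.
    target: boolean mask of the region Omega to synthesize."""
    img = image.astype(np.float64).copy()
    filled = ~target                    # True on the source region I - Omega
    C = filled.astype(np.float64)       # C = 1 outside Omega, 0 inside
    H, W = img.shape

    def conf(p):
        r, c = p
        patch = C[r - half:r + half + 1, c - half:c + half + 1]
        return patch.sum() / patch.size

    while not filled.all():
        # 1a. fill front: unfilled pixels touching a filled neighbour
        front = [(r, c) for r in range(H) for c in range(W)
                 if not filled[r, c]
                 and filled[max(r - 1, 0):r + 2, max(c - 1, 0):c + 2].any()]
        # 1b-c. highest-priority patch (data term constant in this sketch)
        r, c = max(front, key=conf)
        msk = filled[r - half:r + half + 1, c - half:c + half + 1]
        tgt = img[r - half:r + half + 1, c - half:c + half + 1]
        # 2a-b. exhaustive best-match search over fully filled patches
        best, best_d = None, np.inf
        for i in range(half, H - half):
            for j in range(half, W - half):
                if not filled[i - half:i + half + 1, j - half:j + half + 1].all():
                    continue
                cand = img[i - half:i + half + 1, j - half:j + half + 1]
                d = (((tgt - cand) ** 2)[msk]).sum()
                if d < best_d:
                    best, best_d = (i, j), d
        bi, bj = best
        src = img[bi - half:bi + half + 1, bj - half:bj + half + 1]
        # 2c. copy colour values into the unfilled part of the patch
        tgt[~msk] = src[~msk]
        # 3. newly filled pixels inherit C(p_hat), then freeze
        C[r - half:r + half + 1, c - half:c + half + 1][~msk] = conf((r, c))
        filled[r - half:r + half + 1, c - half:c + half + 1] = True
    return img
```

On a periodic stripe pattern with a small hole, this loop copies the matching stripe phase into the hole, which is the behaviour illustrated in fig. 2d.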

4. Results and comparisons

Here we apply our algorithm to a variety of images, ranging from purely synthetic images to full-colour photographs that include complex textures. Where possible, we make side-by-side comparisons to previously proposed methods. In other cases, we hope the reader will refer to the original source of our test images (many are taken from previous literature on inpainting and texture synthesis) and compare these results with the results of earlier work.

In all of the experiments, the patch size was set to be greater than the largest texel or the thickest structure (e.g., edges) in the source region. Furthermore, unless otherwise stated the source region has been set to be Φ = I −Ω. All experiments were run on a 2.5GHz Pentium IV with 1GB of RAM.

The Kanizsa triangle. We perform our first experiment on the well-known Kanizsa triangle [17] to show how the algorithm works on a structure-rich synthetic image.

As shown in fig. 4, our algorithm deforms the fill front δΩ under the action of two forces: isophote continuation (the data term, D(p)) and the “pressure” from surrounding filled pixels (the confidence term, C(p)).

The sharp linear structures of the incomplete green triangle are grown into the target region. But also, no single structural element dominates all of the others; this balance among competing isophotes is achieved through the naturally decaying confidence values (in an earlier version of our algorithm, which lacked this balance, “runaway” structures led to large-scale artefacts).

Figures 4e,f also show the effect of the confidence term in smoothing sharp appendices such as the vertices of the target region (in red).

As described above, the confidence is propagated in a manner similar to the front-propagation algorithms used in inpainting. We stress, however, that unlike inpainting, it is the confidence values that are propagated along the front (and which determine fill order), not colour values themselves, which are sampled from the source region.

Finally, we note that despite the large size of the removed region, edges and lines in the filled region are as sharp as any found in the source region. There is no blurring from diffusion processes. This is a property of exemplar-based texture synthesis.

The effect of different filling strategies. Figures 5, 6 and 7 demonstrate the effect of different filling strategies.

Figure 5f shows how our filling algorithm achieves the best structural continuation in a simple, synthetic image.

Figure 6 further demonstrates the validity of our algorithm on an aerial photograph. The 40 × 40-pixel target region has been selected to straddle two different textures (fig. 6b). The remainder of the 200 × 200 image in fig. 6a was used as source for all the experiments in fig. 6.

With raster-scan synthesis (fig. 6c) not only does the top region (the river) grow into the bottom one (the city area), but visible seams also appear at the bottom of the target region. This problem is only partially addressed by a concentric filling (fig 6d). Similarly, in fig. 6e the sophisticated ordering proposed by Harrison [13] only moderately succeeds in preventing this phenomenon.

In all of these cases, the primary difficulty is that since the (eventual) texture boundary is the most constrained part of the target region, it should be filled first. But, unless this is explicitly addressed in determining the fill order, the texture boundary is often the last part to be filled. The algorithm proposed in this paper is designed to address this problem, and thus more naturally extends the contour between the two textures as well as the vertical grey road.

In the example in fig. 6, our algorithm fills the target region in only 2 seconds, on a Pentium IV, 2.52GHz, 1GB RAM. Harrison’s resynthesizer [13], which is the nearest in quality, requires approximately 45 seconds.

Figure 7 shows yet another comparison between the concentric filling strategy and the proposed algorithm. In the presence of concave target regions, the “onion peel” filling may lead to visible artefacts such as unrealistically broken structures (see the pole in fig. 7c). Conversely, the presence of the data term of (1) encourages the edges of the pole to grow “first” inside the target region and thus correctly reconstruct the complete pole (fig. 7d). This example demonstrates the robustness of the proposed algorithm with respect to the shape of the selected target region.

Comparisons with inpainting. We now turn to some examples from the inpainting literature. The first two examples show that our approach works at least as well as inpainting.

The first (fig. 8) is a synthesized image of two ellipses [4]. The occluding white torus is removed from the input image and two dark background ellipses reconstructed via our algorithm (fig. 8b). This example was chosen by authors of the original work on inpainting to illustrate the structure propagation capabilities of their algorithm. Our results are visually identical to those obtained by inpainting ([4], fig.4).

We now compare results of the restoration of a hand-drawn image. In fig. 9 the aim is to remove the foreground text. Our results (fig. 9b) are mostly indistinguishable from those obtained by traditional inpainting. This example demonstrates the effectiveness of both techniques in image restoration applications.

It is in real photographs with large objects to remove, however, that the real advantages of our approach become apparent. Figure 10 shows an example on a real photograph, of a bungee jumper in mid-jump (from [4], fig.8). In the original work, the thin bungee cord is removed from the image via inpainting. In order to prove the capabilities of our algorithm we removed the entire bungee jumper (fig. 10e). Structures such as the shore line and the edge of the house have been automatically propagated into the target region along with plausible textures of shrubbery, water and roof tiles; and all this with no a priori model of anything specific to this image.

For comparison, figure 10f shows the result of filling the same target region (fig. 10b) by image inpainting. Considerable blur is introduced into the target region because of inpainting’s use of diffusion to propagate colour values; and high-frequency textural information is entirely absent.

Figure 11 compares our algorithm to the recent “texture and structure inpainting” technique described in [5]. Figure 11(bottom right) shows that also our algorithm accomplishes the propagation of structure and texture inside the selected target region. Moreover, the lack of diffusion steps avoids blurring propagated structures (see the vertical edge in the encircled region) and makes the algorithm more computationally efficient.

Synthesizing composite textures. Fig. 12 demonstrates that our algorithm behaves well also at the boundary between two different textures, such as the ones analyzed in [23]. The target region selected in fig. 12c straddles two different textures. The quality of the “knitting” in the contour reconstructed via our approach (fig. 12d) is similar to the original image and to the results obtained in the original work (fig. 12b), but again, this has been accomplished without complicated texture models or a separate boundaryspecific texture synthesis algorithm.

Further examples on photographs. We show two more examples on photographs of real scenes.

Figure 13 demonstrates, again, the advantage of the proposed approach in preventing structural artefacts (cf., 7d). While the onion-peel approach produces a deformed horizon, our algorithm reconstructs the boundary between sky and sea as a convincing straight line. 

Finally, in fig. 14, the foreground person has been manually selected and the corresponding region filled in automatically. The filled region in the output image convincingly mimics the complex background texture with no prominent artefacts (fig. 14f). During the filling process the topological changes of the target region are handled effortlessly.

5. Conclusion and future work

This paper has presented a novel algorithm for removing large objects from digital photographs. The result of object removal is an image in which the selected object has been replaced by a visually plausible background that mimics the appearance of the source region.

Our approach employs an exemplar-based texture synthesis technique modulated by a unified scheme for determining the fill order of the target region. Pixels maintain a confidence value, which together with image isophotes, influence their fill priority.

The technique is capable of propagating both linear structure and two-dimensional texture into the target region. Comparative experiments show that a careful selection of the fill order is necessary and sufficient to handle this task.

Our method performs at least as well as previous techniques designed for the restoration of small scratches, and in instances in which larger objects are removed, it dramatically outperforms earlier work in terms of both perceptual quality and computational efficiency.

Currently, we are investigating extensions for more accurate propagation of curved structures in still photographs and for object removal from video, which promise to impose an entirely new set of challenges.

References

[1] M. Ashikhmin. Synthesizing natural textures. In Proc. ACM Symp. on Interactive 3D Graphics, pp. 217–226, Research Triangle Park, NC, Mar 2001.

[2] C. Ballester, V. Caselles, J. Verdera, M. Bertalmio, and G. Sapiro. A variational model for filling-in gray level and color images. In Proc. ICCV, pp. I: 10–16, Vancouver, Canada, Jun 2001.

[3] M. Bertalmio, A.L. Bertozzi, and G. Sapiro. Navier-Stokes, fluid dynamics, and image and video inpainting. In Proc. Conf. Comp. Vision Pattern Rec., pp. I:355–362, Hawaii, Dec 2001.

[4] M. Bertalmio, G. Sapiro, V. Caselles, and C. Ballester. Image inpainting. In Proc. ACM Conf. Comp. Graphics
(SIGGRAPH), pp. 417–424, New Orleans, LA, Jul 2000. http://mountains.ece.umn.edu/∼guille/inpainting.htm.

[5] M. Bertalmio, L. Vese, G. Sapiro, and S. Osher. Simultaneous structure and texture image inpainting. To appear, 2002. http://mountains.ece.umn.edu/∼guille/inpainting.htm.

[6] R. Bornard, E. Lecan, L. Laborelli, and J-H. Chenot. Missing data correction in still images and image sequences. In ACM Multimedia, France, Dec 2002.

[7] T. F. Chan and J. Shen. Non-texture inpainting by curvature-driven diffusions (CDD). J. Visual Comm. Image Rep., 4(12), 2001.

[8] J.S. de Bonet. Multiresolution sampling procedure for analysis and synthesis of texture images. In Proc. ACM Conf. Comp. Graphics (SIGGRAPH), volume 31, pp. 361–368, 1997.

[9] A. Efros and W.T. Freeman. Image quilting for texture synthesis and transfer. In Proc. ACM Conf. Comp. Graphics (SIGGRAPH), pp. 341–346, Eugene Fiume, Aug 2001.

[10] A. Efros and T. Leung. Texture synthesis by non-parametric sampling. In Proc. ICCV, pp. 1033–1038, Kerkyra, Greece, Sep 1999.

[11] W.T. Freeman, E.C. Pasztor, and O.T. Carmichael. Learning lowlevel vision. Int. J. Computer Vision, 40(1):25–47, 2000.

[12] D. Garber. Computational Models for Texture Analysis and Texture Synthesis. PhD thesis, Univ. of Southern California, USA, 1981.

[13] P. Harrison. A non-hierarchical procedure for re-synthesis of complex texture. In Proc. Int. Conf. Central Europe Comp. Graphics, Visua. and Comp. Vision, Plzen, Czech Republic, Feb 2001.

[14] D.J. Heeger and J.R. Bergen. Pyramid-based texture analysis/synthesis. In Proc. ACM Conf. Comp. Graphics (SIGGRAPH), volume 29, pp. 229–233, Los Angeles, CA, 1995.

[15] A. Hertzmann, C. Jacobs, N. Oliver, B. Curless, and D. Salesin. Image analogies. In Proc. ACM Conf. Comp. Graphics (SIGGRAPH), Eugene Fiume, Aug 2001.

[16] H. Igehy and L. Pereira. Image replacement through texture synthesis. In Proc. Int. Conf. Image Processing, pp. III:186–190, 1997.

[17] G. Kanizsa. Organization in Vision. Praeger, New York, 1979.

[18] J. M. Kasson and W. Plouffe. An analysis of selected computer interchange color spaces. In ACM Transactions on Graphics, volume 11, pp. 373–405, Oct 1992.

[19] L. Liang, C. Liu, Y.-Q. Xu, B. Guo, and H.-Y. Shum. Real-time texture synthesis by patch-based sampling. In ACM Transactions on Graphics, 2001.

[20] S. Masnou and J.-M. Morel. Level lines based disocclusion. In Int. Conf. Image Processing, Chicago, 1998.

[21] S. Rane, G. Sapiro, and M. Bertalmio. Structure and texture filling-in of missing image blocks in wireless transmission and compression applications. IEEE Trans. Image Processing, 2002, to appear.

[22] L.-Y. Wei and M. Levoy. Fast texture synthesis using tree-structured vector quantization. In Proc. ACM Conf. Comp. Graphics (SIGGRAPH), 2000.

[23] A. Zalesny, V. Ferrari, G. Caenen, and L. van Gool. Parallel composite texture synthesis. In Texture 2002 workshop - (in conjunction with ECCV02), Copenhagen, Denmark, Jun 2002.
