Image generation


Refer to the official documentation.

Dall-e-3 model

Generation of an image using dall-e-3.

//uses GenAI, GenAI.Types, GenAI.Tutorial.VCL;

  TutorialHub.FileName := 'Dalle3_01.png';

  //Asynchronous example
  Client.Images.AsynCreate(
    procedure (Params: TImageCreateParams)
    begin
      Params.Model('dall-e-3');
      Params.Prompt('A quarter dollar on a wooden floor close up.');
      Params.N(1);
      Params.Size('1024x1024');
      Params.Style('vivid');
      Params.ResponseFormat(TResponseFormat.url);
    end,
    function : TAsynGeneratedImages
    begin
      Result.Sender := TutorialHub;
      Result.OnStart := Start;
      Result.OnSuccess := Display;
      Result.OnError := Display;
    end);

  //Synchronous example
//  var Value := Client.Images.Create(
//    procedure (Params: TImageCreateParams)
//    begin
//      Params.Model('dall-e-3');
//      Params.Prompt('A quarter dollar on a wooden floor close up.');
//      Params.N(1);
//      Params.Size('1024x1024');
//      Params.Style('vivid');
//      Params.ResponseFormat(TResponseFormat.url);
//    end);
//  try
//    Display(TutorialHub, Value);
//  finally
//    Value.Free;
//  end;

  //Asynchronous promise example
//  var Promise := Client.Images.AsyncAwaitCreate(
//    procedure (Params: TImageCreateParams)
//    begin
//      Params.Model('dall-e-3');
//      Params.Prompt('A quarter dollar on a wooden floor close up.');
//      Params.N(1);
//      Params.Size('1024x1024');
//      Params.Style('vivid');
//      Params.ResponseFormat(TResponseFormat.url);
//    end
//  );
//
//  Promise
//    .&Then<TGeneratedImages>(
//      function (Value: TGeneratedImages): TGeneratedImages
//      begin
//        Result := Value;
//        Display(TutorialHub, Value);
//      end)
//    .&Catch(
//      procedure (E: Exception)
//      begin
//        Display(TutorialHub, E.Message);
//      end);

Let’s take a closer look at how the Display method handles output to understand how the model’s response is managed.

procedure Display(Sender: TObject; Value: TGeneratedImages);
begin
  {--- Save the image to a file: download it when a URL is returned, otherwise save the received content. }
  if not TutorialHub.FileName.IsEmpty then
    begin
      if not Value.Data[0].Url.IsEmpty then
        Value.Data[0].Download(TutorialHub.FileName) else
        Value.Data[0].SaveToFile(TutorialHub.FileName);
    end;

  {--- Load image into a stream }
  var Stream := Value.Data[0].GetStream;
  try
    {--- Display the JSON response. }
    TutorialHub.JSONResponse := Value.JSONResponse;

    {--- Display the revised prompt. }
    Display(Sender, Value.Data[0].RevisedPrompt);

    {--- Load the stream into the TImage. }
    TutorialHub.Image.Picture.LoadFromStream(Stream);
  finally
    Stream.Free;
  end;
end;

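GenAI provides dedicated methods for handling the image content returned by the model. As the Display method shows, Download fetches the image from the returned URL, SaveToFile writes the received content to disk when no URL is present, and GetStream exposes the image as a stream that can be loaded into a TImage.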



Gpt-image-1 model

Since May 5, 2025, OpenAI has offered the gpt-image-1 model for image creation and editing. This new model delivers higher quality compared to dall-e-2 and dall-e-3.

In the configuration, you now have four additional parameters for image generation:

  • background: Allows you to set the transparency of the generated image’s background. Only supported by gpt-image-1. Must be one of transparent, opaque, or auto (default).

  • moderation: Controls the content-moderation level for images generated by gpt-image-1. Must be either low (less restrictive filtering) or auto (default).

  • output_compression: Specifies the compression level (0–100%) for the generated images. Only supported by gpt-image-1 when using the webp or jpeg output formats; defaults to 100.

  • output_format: Determines the format in which generated images are returned. Only supported by gpt-image-1. Must be one of png, jpeg, or webp.


Additionally, several existing parameters have been extended with new values for gpt-image-1:

  • quality: Supports high, medium, and low.

  • size: Supports 1536×1024 (landscape), 1024×1536 (portrait), or auto (default).

  • prompt: Allows up to 32,000 characters for gpt-image-1 (versus 1,000 for dall-e-2 and 4,000 for dall-e-3).
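The limits listed above can be checked locally before a request is sent. The following is a minimal sketch, not part of GenAI: the helper names (IsValidGptImage1Size, IsValidPromptLength, CompressionApplies, IsValidCompression) are hypothetical, and the accepted values are simply those quoted in this section, plus 1024x1024, which the edit example further down uses with gpt-image-1.

```pascal
program ValidateGptImage1Params;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.StrUtils;

const
  // Prompt limit for gpt-image-1, as stated in the parameter list above.
  MaxGptImage1PromptLength = 32000;

// Hypothetical helper: True when Size is one of the values listed for gpt-image-1.
// The comparison is case-sensitive.
function IsValidGptImage1Size(const Size: string): Boolean;
begin
  Result := MatchStr(Size, ['1024x1024', '1536x1024', '1024x1536', 'auto']);
end;

// Hypothetical helper: True when the prompt fits within the documented limit.
// Length counts UTF-16 code units, so this is only an approximation of "characters".
function IsValidPromptLength(const Prompt: string): Boolean;
begin
  Result := Length(Prompt) <= MaxGptImage1PromptLength;
end;

// Hypothetical helper: output_compression only applies to the jpeg and webp formats.
function CompressionApplies(const OutputFormat: string): Boolean;
begin
  Result := MatchStr(OutputFormat, ['jpeg', 'webp']);
end;

// Hypothetical helper: the compression level is a percentage between 0 and 100.
function IsValidCompression(Level: Integer): Boolean;
begin
  Result := (Level >= 0) and (Level <= 100);
end;

begin
  Writeln('Size 1536x1024 accepted: ', IsValidGptImage1Size('1536x1024'));
  Writeln('Size 512x512 accepted: ', IsValidGptImage1Size('512x512'));
  Writeln('Compression applies to png: ', CompressionApplies('png'));
  Writeln('Compression level 80 valid: ', IsValidCompression(80));
  Writeln('Short prompt valid: ', IsValidPromptLength('A coffee cup'));
end.
```

Checks like these only catch obvious mistakes early; the API remains the authority on which values it accepts.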


An example of image creation with gpt-image-1 (asynchronous, because response times are much longer):

//uses GenAI, GenAI.Types, GenAI.Tutorial.VCL;

  TutorialHub.FileName := 'GptImage1.png';

  //Increased reception timeout (ms) as the model takes longer
  Client.API.HttpClient.ResponseTimeout := 120000;

  //Asynchronous example
  Client.Images.AsynCreate(
    procedure (Params: TImageCreateParams)
    begin
      Params.Model('gpt-image-1');
      Params.Prompt('A realistic photo of a coffee cup with saucer on a transparent background');
      Params.N(1);
      Params.Size('1536x1024');
      Params.BackGround('transparent');
      Params.Moderation('low');
      Params.OutputFormat('png');
      Params.Quality('high');
      TutorialHub.JSONRequest := Params.ToFormat();
    end,
    function : TAsynGeneratedImages
    begin
      Result.Sender := TutorialHub;
      Result.OnStart := Start;
      Result.OnSuccess := Display;
      Result.OnError := Display;
    end);

  //Asynchronous promise example
//  var Promise := Client.Images.AsyncAwaitCreate(
//    procedure (Params: TImageCreateParams)
//    begin
//      Params.Model('gpt-image-1');
//      Params.Prompt('A realistic photo of a coffee cup with saucer on a transparent background');
//      Params.N(1);
//      Params.Size('1536x1024');
//      Params.BackGround('transparent');
//      Params.Moderation('low');
//      Params.OutputFormat('png');
//      Params.Quality('high');
//    end
//  );
//
//  Promise
//    .&Then<TGeneratedImages>(
//      function (Value: TGeneratedImages): TGeneratedImages
//      begin
//        Result := Value;
//        Display(TutorialHub, Value);
//      end)
//    .&Catch(
//      procedure (E: Exception)
//      begin
//        Display(TutorialHub, E.Message);
//      end);


Note

The returned JSON includes usage values, which are not provided by the dall-e-2 and dall-e-3 models.
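To read those usage values from the raw JSON, something along the following lines could be used. This is a sketch under assumptions: UsageValue is a hypothetical helper, not a GenAI function; the field names (total_tokens, input_tokens, output_tokens) are those OpenAI documents for gpt-image-1 responses at the time of writing; and the sample payload and its numbers are invented for illustration.

```pascal
program ReadImageUsage;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.JSON;

const
  // Illustrative payload only: the numbers are made up for this example.
  SampleResponse =
    '{"created": 0, "data": [], ' +
    '"usage": {"total_tokens": 150, "input_tokens": 50, "output_tokens": 100}}';

// Hypothetical helper: returns the integer stored at usage.<Field>,
// or -1 when the JSON is invalid or the field is missing.
function UsageValue(const Json, Field: string): Integer;
var
  Root: TJSONValue;
begin
  Result := -1;
  Root := TJSONObject.ParseJSONValue(Json); // returns nil on invalid JSON
  if Root = nil then
    Exit;
  try
    // TryGetValue accepts a dotted path such as 'usage.total_tokens'.
    if not Root.TryGetValue<Integer>('usage.' + Field, Result) then
      Result := -1;
  finally
    Root.Free;
  end;
end;

begin
  Writeln('total_tokens:  ', UsageValue(SampleResponse, 'total_tokens'));
  Writeln('input_tokens:  ', UsageValue(SampleResponse, 'input_tokens'));
  Writeln('output_tokens: ', UsageValue(SampleResponse, 'output_tokens'));
end.
```

In the tutorial code, the same helper could be fed with Value.JSONResponse inside the Display method instead of the sample constant.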


Create image edit with gpt-image-1

Previously, I hadn’t gone into detail about the image-editing process, because the only OpenAI model that supported it, DALL·E 2, produced rather unconvincing results.

However, with gpt-image-1, the output quality is significantly higher.

To perform an edit:

  1. Prepare your base image

    • Open the image you wish to modify and erase the area to be reworked using a transparency tool (brush or selection).
  2. Generate the mask

    • The erased (transparent) region becomes the mask that you’ll supply to the model.
  3. Compose your extended prompt

    • In your request, describe exactly what the model should insert into the masked area. You now have up to 32,000 characters for a fully detailed description.
  4. Execute the edit

    • Provide the model with both the masked image and your prompt; it will then know precisely where and how to apply the changes.
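Before sending the two files, it can help to verify that the mask is usable. OpenAI documents that the mask must have the same dimensions as the base image and, for gpt-image-1, must contain an alpha channel. The sketch below checks both by reading the PNG header directly; it is not part of GenAI, the type and function names are hypothetical, and it only inspects the IHDR colour type, so palette transparency (tRNS chunk) is not detected.

```pascal
program CheckEditMask;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes;

type
  TPngInfo = record
    Width: Cardinal;
    Height: Cardinal;
    HasAlphaChannel: Boolean;
  end;

// Reads a big-endian 32-bit value from Buffer starting at Offset.
function ReadBE32(const Buffer: TBytes; Offset: Integer): Cardinal;
begin
  Result := (Cardinal(Buffer[Offset]) shl 24) or (Cardinal(Buffer[Offset + 1]) shl 16) or
    (Cardinal(Buffer[Offset + 2]) shl 8) or Cardinal(Buffer[Offset + 3]);
end;

// Hypothetical helper: parses the PNG signature and the IHDR chunk from Stream.
function TryReadPngInfo(Stream: TStream; out Info: TPngInfo): Boolean;
const
  Signature: array[0..7] of Byte = ($89, $50, $4E, $47, $0D, $0A, $1A, $0A);
var
  Header: TBytes;
  I: Integer;
begin
  Result := False;
  Info := Default(TPngInfo);
  // 8 signature bytes + 4 length + 4 chunk type + first 10 bytes of IHDR data.
  SetLength(Header, 26);
  if Stream.Read(Header[0], Length(Header)) <> Length(Header) then
    Exit;
  for I := 0 to 7 do
    if Header[I] <> Signature[I] then
      Exit;
  if (Header[12] <> Ord('I')) or (Header[13] <> Ord('H')) or
     (Header[14] <> Ord('D')) or (Header[15] <> Ord('R')) then
    Exit;
  Info.Width := ReadBE32(Header, 16);
  Info.Height := ReadBE32(Header, 20);
  // Byte 25 is the colour type: 4 = greyscale + alpha, 6 = RGB + alpha.
  Info.HasAlphaChannel := Header[25] in [4, 6];
  Result := True;
end;

// Hypothetical helper: reads the header of a PNG file; False if missing or invalid.
function LoadPngInfo(const FileName: string; out Info: TPngInfo): Boolean;
var
  Stream: TFileStream;
begin
  Info := Default(TPngInfo);
  Result := False;
  if not FileExists(FileName) then
    Exit;
  Stream := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  try
    Result := TryReadPngInfo(Stream, Info);
  finally
    Stream.Free;
  end;
end;

// True when the mask has an alpha channel and the same size as the base image.
function MaskIsCompatible(const Image, Mask: TPngInfo): Boolean;
begin
  Result := Mask.HasAlphaChannel and
    (Image.Width = Mask.Width) and (Image.Height = Mask.Height);
end;

var
  ImageInfo, MaskInfo: TPngInfo;
begin
  // File names are the ones used in the edit example of this section.
  if LoadPngInfo('Dalle05.png', ImageInfo) and LoadPngInfo('Dalle05Mask.png', MaskInfo) then
    Writeln('Mask usable: ', MaskIsCompatible(ImageInfo, MaskInfo))
  else
    Writeln('Could not read one of the PNG files.');
end.
```

Reading only the first 26 bytes keeps the check cheap and free of imaging dependencies; a full decoder would be needed to confirm that the transparent region actually covers the area to rework.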

Below, you’ll find an example of the code to send to gpt-image-1 to initiate the edit.


  TutorialHub.FileName := 'Image-gpt-edit.png';

  //Increased reception timeout (ms) as the model takes longer
  Client.API.HttpClient.ResponseTimeout := 120000;

  //Asynchronous example
  Client.Images.AsynEdit(
    procedure (Params: TImageEditParams)
    begin
      Params.Model('gpt-image-1');
      Params.Image('Dalle05.png');          //<--- Unmodified image
      Params.Mask('Dalle05Mask.png');       //<--- Modified image with masked part
      Params.Prompt('Add a pink elephant'); //<--- Replace the mask by building this
      Params.Size('1024x1024');
      TutorialHub.JSONRequest := Params.ToFormat();
    end,
    function : TAsynGeneratedImages
    begin
      Result.Sender := TutorialHub;
      Result.OnStart := Start;
      Result.OnSuccess := Display;
      Result.OnError := Display;
    end);

  //Asynchronous promise example
//  var Promise := Client.Images.AsyncAwaitEdit(
//    procedure (Params: TImageEditParams)
//    begin
//      Params.Model('gpt-image-1');
//      Params.Image('Dalle05.png');          //<--- Unmodified image
//      Params.Mask('Dalle05Mask.png');       //<--- Modified image with masked part
//      Params.Prompt('Add a pink elephant'); //<--- Replace the mask by building this
//      Params.Size('1024x1024');
//    end);
//
//  Promise
//    .&Then<TGeneratedImages>(
//      function (Value: TGeneratedImages): TGeneratedImages
//      begin
//        Result := Value;
//        Display(TutorialHub, Value);
//      end)
//    .&Catch(
//      procedure (E: Exception)
//      begin
//        Display(TutorialHub, E.Message);
//      end);

  • Result with the hidden section:

  • Result after editing: